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Abstract 

Real-time  computer  systems  have  become  more  and  more  important  in  many  ap¬ 
plications,  such  as  robot  control,  flight  control,  and  other  mission-critical  jobs.  The 
correctness  of  the  system  depends  on  the  temporal  correctness  as  well  as  the  functional 
correctness  of  the  tasks.  We  propose  a  scheduling  algorithm  based  on  an  analytic 
model.  Our  goal  is  to  derive  the  optimal  schedule  for  a  given  set  of  aperiodic  tasks 
such  that  the  number  of  rejected  tasks  is  minimized,  and  then  the  finish  time  of  the 
schedule  is  also  minimized.  The  scheduling  problem  with  a  nonpreemptive  discipline 
in  a  uniprocessor  system  is  considered.  We  first  show  that  if  a  total  ordering  is  given, 
this  can  be  done  in  0(ti2)  time  by  djmamic  programming  technique,  where  n  is  the 
size  of  the  task  set.  When  the  restriction  of  the  total  ordering  is  released,  it  is  known 

'This  work  is  supported  in  part  by  Honeywell  under  N00014-91-C-0195  and  Army/Phillips  under  DASG- 
60-92-C-0055.  The  views,  opinions,  and/or  findings  contained  in  this  report  are  those  of  the  author(s)  and 
should  not  be  interpreted  as  representing  the  official  policies,  either  expressed  or  implied,  of  Honeywell  or 
Army /Phillips. 
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to  be  NP-complete  [3].  We  discuss  the  super  sequence  [18]  which  has  been  shown  to 
be  useful  in  reducing  the  search  space  for  testing  the  feasibility  of  a  task  set.  By  ex¬ 
tending  the  idea  and  introducing  the  concept  of  conformation,  the  scheduling  process 
can  be  divided  into  two  phases:  computing  the  pruned  search  space,  and  computing 
the  optimal  schedule  for  each  sequence  in  the  search  space.  While  the  complexity  of 
the  algorithm  in  the  worst  case  remains  exponential,  our  simulation  results  show  that 
the  cost  is  reasonable  for  the  average  case. 


1  Introduction 


In  a  hard  real-time  system,  the  computer  is  required  to  support  the  execution  of  applications 
in  -which  the  timing  constraints  of  the  tasks  are  specified.  The  correctness  of  the  system 
depends  on  the  temporal  correctness  as  well  as  the  functional  correctness  of  the  tasks.  Failure 
to  satisfy  the  timing  constraints  can  incur  fatal  errors.  Once  a  task  is  accepted  by  the  system, 
the  system  should  be  able  to  finish  it  under  the  timing  constraint  of  the  task.  A  task  T,-  can 
be  characterized  as  a  triple  of  (r,,  c,-,  d,),  representing  the  ready  time,  the  computation  time, 
and  the  deadline  of  the  task,  respectively.  A  task  can  not  be  started  before  its  ready  time. 
Once  started,  the  task  must  use  the  processor  for  a  consecutive  period  of  c,-,  and  be  finished 
by  its  deadline.  The  task  set  is  represented  as  T  =  {Tj,  J2, ..., Tn}.  A  task  set  is  feasible  if 
there  exists  a  schedule  in  which  all  the  tasks  in  the  task  set  can  meet  their  timing  constraints. 
Scheduling  is  a  process  of  binding  starting  times  to  the  tasks  such  that  each  task  executes 
according  to  the  schedule.  A  sequence  S  =  (T/, Tf , . . .  ,T/),  where  k  <  n.  Tf  represents 
the  ith  task  of  the  sequence  5  for  any  1  <  :  <  k.  A  sequence  specifies  the  order  in  which  the 
tasks  are  executed.  Without  confusion,  a  schedule  can  be  represented  as  a  sequence.  How 
to  schedule  the  tasks  so  that  the  timing  constraints  are  met  is  nontrivial.  Many  scheduling 
problems  are  known  to  be  intractable  [3]  in  that  finding  the  optimal  schedule  requires  large 
amounts  of  computations  to  be  carried  out. 

The  approaches  adopted  to  date  for  scheduling  algorithms  can  be  generally  classified 
into  two  categories.  One  approach  is  to  assign  priorities  to  tasks  so  that  the  tasks  can  be 
scheduled  according  to  their  priorities  [1,  7,  8,  10, 12, 15, 14].  This  approach  is  called  priority 
based  scheduling.  The  priority  can  be  determined  by  deadline,  execution  time,  resource 
requirement,  laxity,  period,  or  can  be  programmer-denned  [4].  The  other  is  time  based 
scheduling  approach  (9,  13].  A  time  based  scheduler  generates  as  an  output  a  calendar  which 
specifies  the  time  instants  at  which  the  tasks  start  and  finish. 

Generally  speaking,  scheduling  for  aperiodic  task  sets  without  preemption  is  NP-complete 
[3].  Due  to  the  intractability,  several  search  algorithms  [11, 17, 19,  20]  are  proposed  for  com¬ 
puting  optimal  or  suboptimal  schedules.  Analytic  techniques  may  also  be  used  for  optimal 
scheduling.  A  dominance  concept  by  Erschler  et  al  [2]  was  proposed  to  reduce  the  search 
space  for  checking  the  feasibility  of  task  sets.  They  explored  the  relations  among  the  tasks 
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and  determined  the  partial  orderings  of  feasible  schedules.  Yuan  and  Agrawala  [18]  proposed 
decomposition  methods  to  substantially  reduce  the  search  spare  based  on  the  dominance 
concept.  A  task  set  is  decomposed  into  subsets  so  that  each  subset  can  be  scheduled  in¬ 
dependently.  A  super  sequence  is  constructed  to  reduce  search  space  further.  Saksena  and 
Agrawala  [13]  investigated  the  technique  of  temporal  analysis  serving  as  a  pre-processing 
stage  for  scheduling.  The  idea  is  to  modify  the  windows  of  two  partially  ordered  tasks 
which  are  generated  by  the  temporal  relations  so  that  more  partial  orderings  of  tasks  may 
be  generated  recursively. 

The  time  based  model  is  employed  by  several  real-time  operating  systems  currently  being 
developed,  including  MARUTI  [5],  MARS  [6],  and  Spring  [16].  In  this  paper,  we  study  an 
analytic  approach  to  optimal  scheduling  under  the  time  based  model.  When  complicated 
timing  constraints  and  task  interdependency  are  taken  into  consideration,  the  schedulability 
analysis  of  priority  based  scheduling  algorithms  becomes  much  more  difficult.  By  analytic 
approach,  we  believe  that  the  time  based  scheduling  algorithm  and  analysis  require  reason¬ 
able  amounts  of  computations  to  produce  a  feasible  schedule. 

The  rest  of  this  paper  is  organized  as  follows.  In  section  2,  we  describe  how  to  compute 
the  optimal  schedule  for  a  sequence.  In  section  3,  releasing  the  restriction  of  total  ordering 
on  a  sequence,  we  present  the  approach  to  computing  the  optimal  schedule  for  a  task  set. 
Related  theorems  are  also  presented.  In  section  4,  a  simulation  experiment  is  conducted  to 
compare  the  performance  of  different  algorithms.  The  last  section  is  our  conclusions. 

2  Scheduling  a  Sequence 

The  size  of  a  sequence  (task  set)  is  the  number  of  tasks  in  tbe  sequence(task  set),  and  is 
denoted  by  ]5j  ([r|).  A  sequence  S  is  feasible  if  all  tasks  in  5  axe  executed  in  tbe  order  of 
tbe  sequence  and  tbe  timing  constraints  are  satisfied.  For  convenience,  we  further  define  an 
instance ,  I,  to  be  a  sequence  such  that  |I|  =  ]Tj.  We  denote  the  instance  J  by 

Notice  tbat  {}  is  used  to  represent  a  task  set,  and  ()  a  sequence.  Let  T„-  and  Tj  be  two 
tasks  belonging  lo  sequence  S.  If  T;  is  located  before  Tj  in  tbe  sequence  S,  we  say  that 
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Figure  1:  An  instance  1  =  (T/,Tj, 


(Ti,Tj)  conforms  to  5.  A  sequence  51  conforms  to  a  sequence  52,  if,  for  any  T,-  and  Tj, 
(Ti,Tj)  conforming  to  51  implies  (T,-,  Tj)  conforming  to  52.  We  use  a(k)  to  represent  the 
optimal  schedule  of  (T/,T2;, . . . ,  T/)  in  the  sense  that  for  any  feasible  sequence  5  conforming 
to  (T/,  T/ , . . . ,  T/),  either 

|S|  <  K*)l, 

or 

|5|  =  |a(fc)|  and  }s>U[k),  (1) 

•where  and  /„(*)  is  the  finish  time  of  5  and  a(k)  respectively.  a(k)  is  thus  the  optimal 
schedule  for  the  first  k  tasks  of  I.  The  optimal  schedule  for  an  instance  1  can  thus  be 
represented  by  cr(n).  For  simplicity,  let  u*  =  jc(fc)|.  In  this  section,  we  will  discuss  the 
scheduling  for  an  instance.  However,  the  approach  is  generally  applicable  to  any  sequence. 

2.1  Preliminary 

We  assume  that  r,-  +  a  <  di  holds  for  each  task  T,-  in  the  task  set  T.  At  the  first  glance,  one 
may  attempt  to  compute  a(k)  based  on  cr(k  —  1).  However,  with  careful  examination,  we 
can  find  that  merely  computing  a(k  —  1)  does  not  suffice  to  compute  c(k).  This  is  illustrated 
by  the  example  in  Figure  1.  From  this  example,  we  can  obtain 

<7(1)  =  (T‘) 
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c(2)  =  ( T’X i). 


At  the  next  step,  £7(2)  ©  (Tj)  is  not  feasible,  where  the  operator  ©  means  concatenation 
of  two  sequences.  One  task  must  be  rejected,  which  is  IT/  in  this  case.  Hence,  we  got 

,7(3)  =  <7(2)  =  (Tl,Tj). 

A  problem  arises  at  the  next  step.  <x(3)  ©  (Tj)  is  not  feasible  either.  If  we  try  to  fix  it 
by  taking  a  task  off  this  sequence,  the  result  is 

cr'(4)  =  a(3)  =  (r/,r/). 

However,  the  correct  result  should  be 

*(4)  = 

Although  both  £r'(4)  and  a(4)  axe  of  the  same  size,  the  latter  comes  with  a  shorter  finish 
time,  which  becomes  significant  at  next  step.  We  get 

c(n,  =  <7(0)  =  «(4)  e  (T’)  =  (r/,r/,r/). 

However,  with  £t'(4),  we  would  have 

<<'(5)  =  a'(4)  =(T',  7i). 

.This  example  shows  that  merely  computing  &{k  —  1)  does  not  suffice  to  compute  a(k). 
Wffien  o{k  —  1)  is  obtained,  it  can  not  be  predicted  as  to  which  tasks  would  be  included  in 
<y(k).  The  approach  has  to  be  modified  as  follows. 
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2.2  Sequence-Scheduler  Algorithm 

We  denote  by  S(k,j)  the  sequence  such  that  S(kyj)  conforms  to  . .  ,7*)  and  |5(fc,jjj  = 

j,  where  j  <  |cr(fc)|.  S(k,j)  represents  any  sequence  of  j  tasks  picked  up  from  the  first  k 
tasks  of  5.  We  further  define  a  sequence,  denoted  by  a(k,j),  to  be  the  optimal  schedule  with 
degree  j  for  (T/,  T/, . . . ,  T/)  in  the  sense  that  for  any  feasible  sequence  5(£,j),  we  have 

fcr(kj)  <  fs(kj)- 

Notice  that  a(k)  is  an  abbreviation  of  a(k ,  u*).  If  a  sequence  S(k,j )  is  not  feasible, 
fs(kj)  =  oo. 

We  would  like  to  compute  cr(k,j)  based  on  a(k  —  l,j'),  where  j'  <  j  <  |or(/:)|.  The  basic 
idea  is  as  follows.  We  know  c(k,j )  either  contains  T/  or  not.  If  so,  then  the  other  j  —  l 
tasks  are  picked  up  from  the  first  k  —  1  tasks,  and  a(k  —  1,  j  —  1)  is  one  of  the  best  choices. 
In  this  case,  cr(k,j)  =  a (k  —  1,  j  —  1)  ©  T/.  If  <r(k,j)  does  not  contain  Tl ,  all  of  the  j  tasks 
should  be  picked  up  from  the  first  k  —  1  tasks,  and  a(k  —  l.j)  is  one  of  the  best  choices.  In 
this  case,  cr(k,j)  =  cr(k—  1,  j).  Whether  taking  T/  or  not  is  determined  by  comparing  which 
one  of  the  sequences  comes  with  a  shorter  finish  time.  Therefore,  cr(k,j)  can  be  determined 
by  cr(k  —  l,j  —  1),  and  cr(k  —  l.j).  The  computation  of  a(k  —  1,;)  is  in  turn  based  on 
a(k  —  2 ,j  —  1),  and  cr(k  —  2,  j).  In  general,  at  each  step  k,  we  need  to  compute  &{k,j)  for 
j  =  1,2, . . . ,  |<t(&)|.  The  algorithm  Sequence-Scheduler  in  Figure  2  formalizes  this  idea.  It  is 
worth  mentioning  that  the  condition  of  the  ‘while”  statement  in  the  algorithm  is  designed 
to  let  j  increase  from  1  through  |a(fc)|.  The  correctness  is  verified  in  the  next  section. 

2.3  Verifi cation  of  Sequence- Scheduler  Algorithm 

The  proof  of  the  correctness  of  the  algorithm  along  with  some  related  lemmas  are  given 
below. 

Lemma  1  Let  Si  and  52  be  two  sequences  such  that  fsi  S  fs 7-  If  52  ©  {Ts)  is  feasible, 
then  fsie(Tx)  <  fs2e(Tx)- 
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Algorithm  Sequence-Scheduler 


Input:  an  instance  1  =  (71/,  T/ , . . . ,  T[) 

Output:  the  optimal  schedule  a(n)  —  a(n,u„)  fox  I 

a(0,0)  :=  {)',  Uo  =  0 

for  k  :=  1,2, ...,n 

while  (j  <  uk_i)  or  ((j  =  u*_a  +  1)  and  (cr(fc  —  l,u*_i)  ©  (T*)  is  feasible)) 

^  /c(Jr-lJ-l)e{T/>  <  /»(*-  1J) 

a(fc,i)  :*  cr(fc  -  1,  j  -  1)  ©  (T/) 

else 

:= 

endif 
j  :=  j  -f  1 
endvrhile 
tti  :=  j  -  1 
endfor 


Figure  2:  Sequence-Scheduler  Algorithm 
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PROOF:  This  is  straightforward  via  the  following  equations. 

fsie(Tx)  -  rnax(fs 1 ,  rTx )  +  cTl 
<  max(fsi ,  rjx )  +  cr, 

=  Js2e(TI) 

□ 

Corollary  1  Let  51  and  S2  be  two  sequences  such  that  fs\  <  fsi-  If  £2  ©  S3  is  feasible, 
where  53  is  another  sequence,  then  fsxesz  <  fsi®sz- 

PROOF:  This  is  a  direct  result  of  applying  Lemma  1  repeatedly  through  the  tasks  in  53.  □ 
Lemma  2  u*  =  Uk-y  or  n*  =  u*_i  +  1. 

PROOF:  It  is  obvious  that  Uk  >  Uk- 1,  where  both  Uk  and  Uk-\  are  integers.  Let  us  assume 
that  uk  =  u*_i+a,  and  a  >2.  We  axe  going  to  show  that  this  assumption  does  not  hold.  We 
know  a(k.Uk )  either  contains  T/  or  not.  If  cr(k,Uk)  contains  T/,  we  can  represent  cr(k,Uk ) 
as  S(k  —  1  ,Uk  —  1)  ©  (T/),  by  picking  up  a  proper  sequence  S(k  —  Lit*  —  1).  However, 
from  the  assumption  above,  we  have  Uk- j  =  Uk  —  a  <  Uk  —  I  =  | S(k  —  l7iifc  —  1)|.  This 
contradicts  the  definition  of  On  the  other  hand,  if  c(k,Uk)  does  not  contain  T/,  we 

can  represent  a(k.  Uk )  as  S(k  —  1,  w*).  We  have  =  Uk  —  a  <  u*,  which  is  a  contradiction. 
The  assumption  thus  does  not  hold.  Therefore,  we  have  a  <  1.  D 

From  this  lemma,  a(k,j)  does  exist  for  j  <  u*_j.  Furthermore,  in  the  algorithm,  j  — 
Uk-i  -i- 1  is  tested  to  see  if  u*  =  u*_j  -j- 1. 

Theorem  1  For  k  =  1,2, and  j  =  1,2, if  f?(k—\j—y)Q(Tl)  ^  ij)?  then 
cr(kj)  =  c(k  -  1 J  -  1)  ©  (T/);  otherwise,  a(k,j )  =  cr(fc  -  1,  j). 

PROOF:  The  proof  is  by  induction  on  k.  When  i  =  1,  ly  =  (T/).  Since  it3  <  1  and  (Ty)  is 
feasible,  <r(l,l)  =  (IT/).  It  is  easy  to  come  up  with  the  same  result  through  this  theorem. 
Thus  holds  the  base  case.  We  assume  that  we  can  compute  a {k—  1  ,j),  for  j  —  1, 2, ... ,  Uk-u 


221 


in  the  same  way,  and  consider  the  case  of  k.  Let  us  first  bring  forward  three  basic  equations. 
Since  fa(k-ij-i)  S  Js{k- tbe  following  equation  holds  by  Lemma  1 

/<r(*-W-i)e(7')  ^  f s(y.-ij-\)e{T’y  (2) 

By  induction  hypothesis  on  c(k  —  1  ,j)  such  that  fc(k-\j)  <  fs(k-ijy  we  have 
lf  then  /p(i-i,j-i)©(T/)  <  /s(*-U)-  (3) 

From  Equation  2,  we  have 

*/  fe(k-ij)  <  fo(k-\j-\)e{T‘y  ihen  fc(k-xj)  <  (4) 

From  Lemma  2,  we  know  either  uj.=  or  Uk=  The  two  cases  are  discussed 

below. 

Case  I:  u*=  Uk- i+l.  We  first  discuss  the  situation  when  j  —  1,2,. . .  —  1.  We  know 

that  a  feasible  sequence  S(k,j)  is  either  in  the  form  of  S(k  —  1,  j)  or  S(k  —  1,  j  —  1)  ©  (T/). 
If  fc-(k-ij-7)B{T’)  <  Mk- ij),  then  <  fs(k-ij)  by  Equation  3.  Also  we  have 

/<T(*-u-i)e(r')  S  Js{k-ij-i)e{Tl)  by  Equation  2.  This  means  /<r(*_1>J-_1)e{T;)  <  fS(kj)  for  any 
feasible  sequence  S(k,j).  Consequently,  cr(kj)  =  u{k  -  l,j  -  1)  ©  (T/),  which  justifies  the 
theorem.  On  the  other  hand,  if  /cfi-ij-:)©^)  >  /p(t-ij).  then  fc{k-ij)  <  Jsf.*— : j-a)s(7rJ) 
by  Equation  4.  In  addition,  fa(k-ij)  <  fs(k-ij)  by  induction  hypothesis  on  o(lc  —  1,  j).  So 
fc{k-ij)  <  fs(kj)  for  any  feasible  sequences  5(i,j).  In  this  case,  a{k,j )  =  c(A:  —  1J),  which 
justifies  the  theorem. 

Then  we  discuss  the  situation  when  j  =  ti*.  Since  u^=  Uk_i4l,  it  is  clear  that  (T* ) 
belongs  to  c(fc):  otherwise,  we  need  to  pick  up  tasks  from  Ik- 1  to  make  a  feasible 

sequence,  which  violates  the  definition  of  .  Therefore,  cr(k,j)  can  be  expressed  as  S{k  —  1, 
u*-i) B  (Tl)  by  picking  up  a  proper  sequence  S(k-1,  u*_i).  Note  that  ujt_i=  j- 1  here.  By- 
Equation  2,  we  have/^i.^jg^  <  fs{k-ij-i)e(T>),  for  *ay  sequence  S(fc-l,j-l)©{T/). 
Thus,  c(&,y)  =  cr(fc  —  —  I)  ©  (Tfc).  Now 'let  us  check  the  theorem.  The  sequence 

c{k  —  l,i)  =  a(fc  —  1,  TXfc_i+l)  is  not  feasible;  thus  its  finish  time  is  oo.  The  condition 
fe{k-ij-i)B{Ti)  <  Je(k-ij)  is  satisfied.  So  cr(J:, j)  =  o():  -  1 ,  j  —  1)  ©  (T*).  This  justifies  the 
theorem. 

Case  II:  uj=  Uj_i .  The  reasoning  follows  the  discussion  of  the  first  part  in  Case  I.  □ 
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3  Scheduling  a  Task  Set 


In  this  section  we  discuss  how  to  schedule  a  task  set  by  using  Sequence-Scheduler.  The 
opiimol  schedule,  p,  of  a  task  set  is  defined  as  follows:  for  any  feasible  sequence  S  consisting 
of  tasks  in  T,  we  have  either 

\s\  <  IpI, 


or 


|S|  =  |i|  and  fs  >  f„. 

For  simplicity,  we  use  optimal  schedule  to  represent  the  optimal  schedule  of  the  task 
set,  when  there  is  no  confusion.  Note  that  the  optimal  schedule  of  the  task  set  is  the  best 
one  of  the  optimal  schedules  of  all  instances  in  the  task  set.  Erschler  ei  al  [2]  proposed  the 
dominance  concept  to  reduce  the  number  of  permutations  that  should  be  examined  for  the 
feasibility  test  of  a  task  set.  Yuan  and  Agrawala  [18]  proposed  the  super  sequence  to  further 
reduce  the  search  space  for  testing  the  feasibility  of  a  task  set.  In  this  section,  we  show  that 
for  our  optimization  problem,  the  super  sequence  provides  a  valid  amd  pruned  search  space. 
In  other  words,  there  exists  one  optimal  schedule  which  conforms  to  an  instance  in  the  super 
sequence  of  the  task  set.  Thus  we  may  use  Sequence-Scheduler  to  schedule  for  the  instances 
extracted  from  the  super  sequence  to  derive  the  optimal  schedule.  There  may  exist  more 
than  one  optimal  schedule  for  a  task  set.  Our  interest  is  on  how  to  derive  one  of  them. 

3.1  Super  Sequence 

Temporal  relations  between  two  tasks  and  Tj  are  summarized  in  the  following.  They  are 
illustrated  by  Figure  3. 

•  leading  :  T,-  -<  Tj,  if  r,-  <  'r,-,  d;  <  dj,  but  both  of  the  equalities  do  not  hold  at  the  same 
time. 

•  matching  :  T  j|  Tj,  if  rt-  =  rj,  d,-  =  dj. 

•  containing  :  Ti  U  Tj,  if  r,-  <  rj,  d,  >  dj. 
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(ii)  (in) 

Figure  3:  (*)  T<  ■<  Tj\  (ii)  T{  ||  Tf,  (Hi)  T{  U  T: 


A  task  h  is  called  a  iop  task  if  there  is  no  task  contained  by  h.  A  task  is  called  a  nontop 
task  if  it  contains  at  least  one  task.  Assume  that  we  have  t  top  tasks  in  the  task  set,  denoted 
by  . . .  .ht  respectively.  Denote  by  My  the  set  of  tasks  that  contain  the  top  task  hy, 

including  hy,  and  by  My  the  set  of  tasks  in  the  task  set  T  that  do  not  belong  to  My.  We  say 
that  Ti  is  weakly  leading  to  Tj,  denoted  by  Ti  <3  Tj,  if  Ti  <  Tj  or  T,  ||  Tj.  If  Ii  <3  Tj  fox  all  Tj 
belonging  to  S,  then  T,-  <  5. 

The  dominance  concept  is  originally  developed  by  Erschlex  et  al  [2]  to  reduce  the  search 
space  for  testing  the  feasibility  of  a  task  set.  The  idea  is  extended  with  the  super  sequence 
proposed  by  "Yuan  and  Agrawala  [18].  An  instance  I  dominates  an  instance  I'  iff: 

V  feasible  =?•  I  feasible. 

It  can  be  considered  that  1  is  a  better  candidate  as  a  feasible  schedule  than  J7.  A 
dominant  instance  is  an  instance  such  that  for  each  possible  instance  1  of  the  task  set,  if 
1  dominates  the  dominant  instance,  then  the  dominant  instance  dominates  J.  Thus  the 
dominant  instance  can  be  considered  as  the  best  candidate  of  the  feasible  schedule.  A  set  of 
instances  is  said  to  be  a  dominant  set,  if  J  does  not  belong  to  the  dominant  set,  then  there 
exists  a  dominant  instance  in  the  dominant  set  such  that  the  dominant  instance  dominates 
J. 
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A  super  sequence  A  serves  similarly  as  a  dominant  set  in  that  there  exists  a  dominant 
instance  in  the  super  sequence;  and  it  is  more  appropriate  for  solving  our  problem.  A  super 
sequence  is  a  sequence  of  tasks,  where  duplicates  of  tasks  are  allowed.  The  purpose  is  to 
extract  instances  from  the  super  sequence  for  scheduling.  The  super  sequence  is  constructed 
according  to  the  dominant  rules  (2, 18]  described  below.  Whenever  a  task  satisfies  one  of  the 
conditions  specified  by  the  rules,  a  duplicate  of  the  task  is  inserted  into  the  super  sequence. 
Note  that  duplicates  can  only  be  generated  for  nontop  tasks.  The  top  tasks  appear  once  and 
only  once  in  the  super  sequence. 

Rule  Rl:  Let  Ta  and  Tp  be  any  two  top  tasks.  If  T0  -<  Tp,  then  Ta  is  positioned  before  Tp. 
If  T0  ||  Tp,  the  order  of  the  two  top  tasks  is  determined  arbitrarily. 

A  unique  order  of  the  top  tasks  can  be  thus  determined  for  the  super  sequence.  Let  us 
denote  the  sequence  composed  of  the  top  tasks  by  H  =  (Aj,  A2, . . . ,  hi).  The  rule  implies 
that  if  T„  is  positioned  before  Tp  in  the  super  sequence,  then  Ta  <3  Tp.  So  A3  <  A  2  <3  •  •  •  <3  ht. 

Rule  R2: 

(1)  A  nontop  task  can  be  positioned  before  the  first  top  task  A3  only  when  it  contains  A3. 

(2)  A  nontop  task  can  be  positioned  after  the  last  top  task  ht  onl}7  when  it  contains  Aj. 

(3)  A  nontop  task  can  be  positioned  between  A*  and  hk~i  only  when  it  contains  Ajt  or  A«j.i. 

The  t  top  tasks  delimit  the  super  sequence  into  t  ■+  1  regions  by  rule  Rl.  Now  we  have 
i  i  1  subsets  of  nontop  tasks  separated  by  the  t  top  tasks  by  rule  R2.  Generally  speaking, 
a  nontop  task  has  more  than  one  possible  location.  Denote  the  Ath  subset  by  Ak,  which  is 
between  top  tasks  A*  and  A*+3.  From  rule  R2,  it  can  be  deduced  that 

Ak  =  Bkjzj  U  Bk<k+y  U  BkMi,  where  (5) 

Bkj+ 1  =  Mk  n  Mk+ 1,  Bk,k+i  =  D  Mk+ 1,  Bk'k+i  =  Mk  n  Mk+ 1- 

Next  rule  is  to  specify  the  order  of  the  tasks  within  each  subset. 


225 


Rule  R3:  In  each  subset  Ak,  for  k  =  0, 1, . . .  ,n, 

(1)  the  tasks  in  Bkj^  are  ordered  according  to  their  deadlines,  and  tasks  with  the  same 
deadlines  are  ordered  arbitrarily, 

(2)  the  tasks  in  Bjc.hi  are  ordered  arbitrarily, 

(3)  the  tasks  in  Bj_kJ_^  are  ordered  according  to  their  ready  times,  and  tasks  with  the  same 
ready  times  are  ordered  arbitrarily, 

(4)  the  tasks  in  Bk  are  positioned  before  those  in  ,  which  in  turn  are  positioned 

before  those  in  Bkk+i . 

Now  we  are  ready  to  construct  the  super  sequence  with  these  three  rules.  Top  tasks  are 
first  picked  out  and  ordered,  forming  t  ■+  1  regions.  In  each  region,  there  is  a  subsequence 
of  nontop  tasks.  An  instance  extracted  out  of  the  super  sequence  is  one  that  conforms  to 
the  super  sequence  without  duplication  of  tasks.  Let  q  be  the  number  of  top  tasks  that  a 
nontop  task  contains.  The  number  of  possible  regions  the  nontop  task  can  fall  into  is  q  +  1. 
The  number  of  instances  in  the  super  sequence  thus  sums  up  to 

K  =  Sfo+l)-, 

e=l 

where  nf  is  the  number  of  nontop  tasks  which  contains  q  top  tasks.  Compared  with 
an  exhaustive  search  which  takes  up  to  n!  instances  (permutations)  into  account,  the  super 
sequence  generally  leads  to  a  smaller  set.  Notice  that  it  takes  0(n )  time  to  check  if  an 
instance  is  feasible.  Hence,  the  time  complexity  of  the  feasibility  test  for  the  task  set  is 
0{N  *  n). 

3.2  Leading  Theorem 

The  super  sequence  is  not  only  useful  in  testing  the  feasibility  of  a  task  set;  we  will  show  that 
it  is  also  useful  in  reducing  the  number  of  instances  to  be  examined  in  order  to  obtain  the 
optimal  schedule  of  a  task  set.  We  will  show  that  there  exists  at  least  one  optimal  schedule 
which  conforms  to  an  instance  in  the  super  sequence  A.  Hence,  it  suffices  to  check  through 
A  to  obtain  the  optimal  schedule  of  T. 
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It  is  worth  attention  that  the  top  tasks  in  p  may  not  be  the  same  top  tasks  of  I\  This 
arises  because  some  of  the  top  tasks  of  T  may  be  rejected,  introducing  new  top  tasks  in 
p.  Before  proceeding  to  verify  the  rules  for  the  super  sequence,  we  will  first  introduce  the 
Leading  Theorem.  It  serves  as  the  base  for  further  analysis  in  the  Dominance  Theorem  and 
Conformation  Theorem  to  be  described  later.  The  Leading  Theorem  tells  that  -under  certain 
condition  we  can  adjust  the  order  of  tasks  to  satisfy  the  Weakly  Leading  Condition  to  be 
defined  below  and  do  not  introduce  a  schedule  with  greater  finish  time. 

Assume  that  S  is  a  feasible  sequence,  with  X^e,X ,  and  Lpo st  subsequences  of  S  such  that 

S  —  Lprc  ©  X  ©  Lpos1 . 

Let  us  denote  L  by 
L=(Te,TXl,TX3i...,TSu,Ta), 

where  u  >  0.  A  frame  F  is  defined  to  be  a  time  interval  characterized  by  a  beginning 
time  bp,  and  an  ending  time  ep.  We  say  that  F  is  a  frame  corresponding  to  X,  if  bp  =  sp, 
and  ep  =  /e,  where  sp  is  the  starting  time  of  Tp,  and  fa  is  the  finish  time  of  Ta. 

Theorem  2  (Leading  Theorem)  Assume  that  5  =  ®L®Lpost  is  a  feasible  sequence, 
where  L  =  {Tp , TXz , Tx„ . . . , 7h_,, Ta).  Let  F  be  a  frame  corresponding  to  L.  If  Ta  -<  Tjj,  and 
there  does  not  exist  a  task  TXi,  1  <  i  <  u,  such  that  F  U  T-,.,  then  there  exists  a  sequence  L 
which  is  a  permutation  of  L  such  that 
(i)  (Jo,  Tp)  conforms  to  X,  and 


Before  we  can  proceed  to  prove  the  theorem,  the  following  definition  is  useful. 

Weakly  Leading  Condition:  a  sequence  5  =  satisfies  Weakly  Leading 

Condition  if  Tf  <  Tf  <  . . .  <  T]f . 

Lemma  3  Let  5  be  a  sequence  satisfying  Weakly  Leading  Condition.  If  conforms 

to  S  and  7)  |]  T),  then  all  tasks  located  between  Ti  and  Tj  in  S  must  match  Ti  and  Tj. 
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PROOF:  For  any  task  T-  located  between  T;  and  Tj,  according  to  the  definition  of  Weakly 
Leading  Condition,  we  have  r,-  <  rx  <  rj  and  d,-  <  d-  <  dj.  Since  T;  j|  Tj,  rt-  =  rj  and 
d;  =  dj.  Therefore,  we  have  r,-  =  r  x  =  rj  and  d;  =  d-  =  dj.  So,  T-  matches  21-  and  2).  □ 

To  obtain  L,  let  us  modify  the  tasks  in  L  in  the  following  way.  If  the  ready  time  of  a 
task  is  less  than  bp ,  then  its  ready  time  is  set  to  bp.  If  the  deadline  of  a  task  is  greater  than 
tp,  then  its  deadline  is  set  to  tp.  The  computation  times  remain  unchanged.  Let  L’  be  a 
sequence  consisting  of  the  modified  tasks  with  the  same  order  of  L,  i.e., 

r/  _  irpr  rpl  rpl  rpl  rpl  \ 

Is  —  *•••*-*  s<(,9-40/* 

Since  Ta  -<  Tp,  dp  >  da  >  fa  =  tp.  So  d'0  =  d'Q  =  ep.  Also  ra  <  rp  <  sp  —  bp,  so 
T'a  —  T'p~  bp.  This  is  illustrated  in  Fig  4  (ii). 

Note  that  swapping  Tp  and  T'a  in  the  sequence  does  not  result  in  a  feasible  sequence  in 
this  example.  It  is  essential  that  we  adjust  the  order  of  the  tasks  located  between  them.  Let 
L'  be  a  sequence  which  is  a  permutation  of  L'  and  satisfies  the  Weakly  Leading  Condition, 
and  to  which  (T'D,T'p)  conforms.  Furthermore,  L'  satisfies  an  even  stronger  condition.  If  T^' 
is  positioned  before  Tf"  in  L' ,  then  TJp  <3  Tf'\  if,  furthermore,  T, P  ||  if the  corresponding 
tasks  T-r  and  Tj 1  satisfies  that  Tfr  <  Tj; .  The  idea  of  such  arrangement  is  that  when 
interchanging  Tp  and  T'0,  we  do  not  produce  a  new  reversed  pair  like  them.  By  reversed  pair 
we  mean  ior  example  Ta  -<  Tp  but  Tp  is  positioned  before  Ta  in  the  sequence.  So,  if  (Tj‘  ,Tj') 
conforms  to  L',  the  corresponding  tasks  satisfies  the  condition  that  either  Tj'  ~<  Tj'  or 
Tj  U  Tj  or  Tj  L!  Tj  .  One  possibility  of  L'  is  illustrated  in  Fig  4  (iii),  or 

i' = 

The  existence  of  such  a  sequence  is  proved  later.  Finally,  L  can  be  a  sequence  with 
the  same  order  of  L' ,  but  the  ready  times  and  deadlines  of  the  tasks  are  recovered  to  their 
original  settings.  This  is  illustrated  in  Fig  4  (iv).  The  figures  give  the  rough  idea  about  how 
the  adjustment  of  task  order  can  be  made  to  satisfy  the  conditions  described  in  the  Leading 
Theorem.  Here  below  is  the  proof  of  the  Leading  Theorem. 
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PROOF  (of  Leading  Theorem):  We  would  first  show  the  existence  of  L.  The  modification  of 
ready  times  and  deadlines  of  the  tasks  for  V  is  done  in  such  a  way  that  their  started  times 
are  not  affected.  In  addition,  their  computation  times  remain  the  same.  It  is  clear  that  L' 
is  feasible,  and 

f Lpr  «©■!/  flspr 

We  can  obtain  11  iD  the  following  way.  At  the  first  step,  the  first  task  Tp  of  11  is  the  task 
in  L‘  such  that,  for  any  task  Tx  belonging  to  11 ,  Tp  <1  Tx.  Such  a  task  Tp  exists  because  there 
are  no  containing  relations  among  the  modified  tasks,  and  ties  can  be  broken  arbitrarily.  Tp 
is  exchanged  with  the  task  located  just  left  to  it  in  11 .  Continue  the  exchanging  process 
until  Tp  occupies  the  first  location  in  the  sequence.  At  the  second  step,  the  second  task 
Tp  of  1!  is  the  task  in  11  such  that,  for  any  task  Ts  belonging  to  11  except  Tp,  Tp  <3  T-. 
Exchange  Tp  with  its  left  neighbor  task  consecutively  until  it  occupies  the  second  location 
in  the  sequence.  At  the  ith  step,  the  ith  task  Tp  of  11  is  the  task  in  L'  such  that,  for  any  task 
Ts  belonging  to  L'  except  Tp  through  Tpj ,  Ip  <3  Ts.  Exchange  Tp  with  its  left  neighbor 
task  consecutively  until  it  occupies  the  ith  location  in  the  sequence.  We  keep  performing  this 
operation  until  we  finally  obtain  L'.  Insertion  of  Tp,  1  <  i  <  |Z/j,  into  the  ith  position  of 
the  sequence  by  consecutive  swapping  is  possible  because  T?  <3  T-  for  all  T-  not  belonging  to 
(Tp  . . .  Tpj).  In  a  word,  the  adjustment  is  possible  because  there  are  no  containing  relations 
among  the  modified  tasks,  and  hence  there  exists  a  total  ordering  of  the  modified  tasks  by 
the  Weakly  Leading  Condition.  The  resultant  L'  is  existent  and  is  a  sequence  satisfying  the 
Weakly  Leading  Condition. 

There  is  a  chance  that  {Tp,  T0)  conforms  to  11.  By  Lemma  3,  all  tasks  located  between  Tp 
and  TQ  must  match  each  other.  Hence  the  order  of  these  tasks  does  not  make  any  difference. 
We  can  thus  exchange  the  position  of  Tp  and  T0,  which  makes  {Tp,  Ta)  conform  to  L' 

During  the  process  of  adjusting  the  position  of  Tp,  1  <  i  <  \L'\,  Tp  leads  to  or  matches 
any  task  in  the  sequence  except  (Tp  . . .  Tpa).  Thus  we  can  apply  Lemma  4,  to  be  described 
next,  which  assures  that  the  resultant  sequence  >_:„er  swapping  Tp  to  the  ith  location  comes 
with  a  shorter  or  equal  finish  time.  This  explains 

f LprtSl'  —  1 ' tpreSt'  f LprcQL- 
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X  is  a  sequence  with  the  same  order  of  X',  but  the  ready  times  and  deadlines  of  the 
tasks  axe  recovered  to  their  original  values.  Each  task  in  X  can  be  started  no  later  than  the 
starting  time  of  the  same  task  in  X'.  Consequently, 

iLprcBL  —  f t/prtffiL- 

By  Corollary  1,  we  have 

fLprc@l@Lpc„  ~  j ‘isprcQL&Lpeit  * 


□ 


Lemma  4  Assume  that  51  ©  52  ©  { Tj )  ©  53  is  feasible,  where  51, 52,  and  53  axe  sequences. 
If  Tj  <  52,  then  /sie<7J)eS2©S3  <  /si©S2©(r,)©S3- 

PROOF:  We  will  prove  the  theorem  by  induction  on  j52|.  When  |52|  =  0,  it  is  vacuously 
true.  Assume  that  it  is  true  when  |52j  =  k.  We  would  like  to  show  thai  it  is  true  when 
|52|  =  k  ©  1.  Let  52  =  (Ti)  ©  52',  where  152'!  =  fc;  i.e., 

51  ©  52  ©  { T: )  ©  53  =  51  ©  (T)  ©  52'  ©  (Tj)  ©  53. 

We  can  view  51  ©  (Ti)  as  a  single  sequence,  and  because  |S2.'|  —  A,  by  induction  hypotb- 
esis,  we  have 

/sie(7V>e{x,)es2'es3  5  /si©(r.)©S2'©(r,)©53- 
By  definition, 

/sig(T,)©;Tj)  =  ma.x(max(fs,Ti)  +  c,-,ry)  +  Cj 

=  max(fs  +  a  +  Cj ,  r,-  +  a  +  Cj ,  ry  +  c,) 

Since  T:  <  52,  which  indicates  that  Tj  <  Ti ,  we  have  r3-  <  r,-,  and  Tj  -f  Cj  <  r,-  -f  a  +  e,-. 
/si©(Ti)e(Tj)  =  rnax(fs  4-  c,-  cy-r,-  +  a  +  cy) 
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On  the  other  hand, 

/si©(T,)©<7.->  =  max{max(fs,Tj)  + Cj,ri)  +  a 

=  max(fs  +  c, •  +  Cj,  r j  +  a  -f  cy,  r<  +  c,) 

Because  r;-  -f  c,-  +  cj  <  r,-  -f  c;  +  Cj,  we  have 

f s\e(T,)e(Ti)  <  /sie(T,)e<T>)- 
By  Corollary  1, 

/sie(T,-)e(T.->eS2'eS3  —  /si©{r,>e(r,-)eS2'©S3- 
Therefore, 

/si6{X,)eS2©S3  <  f Sl&S2Q(Tj)QSZ- 

□ 


3.3  Dominance  Theorem 

The  super  sequence  is  constructed  for  the  feasibility  test  of  a  task  set.  If  a  task  set  is  feasible, 
we  say  that  there  exists  a  full  schedule  of  the  task  set.  There  may  exist  more  than  one  full 
schedule  for  a  given  task  set.  An  optimal  full  schedule  is  a  full  schedule  whose  finish  time  is 
shortest  among  all  the  possible  full  schedules.  Note  that  a  full  schedule  is  a  feasible  instance. 
In  this  section,  we  prove  that  if  a  task  set  is  feasible,  there  exists  an  optimal  full  schedule 
conforming  to  the  super  sequence  *  Hence,  the  super  sequence  provides  a  valid  and  pruned 
search  space  for  deriving  the  optimal  full  schedule  of  a  task  set. 

'In  {2),  Erschler  ci  cl.  ’s  theorem  implied  a  similar  result:  if  a  task  set  is  feasible,  there  exists  a  full  schedule 
in  the  dominant  set.  Our  theorem  further  shows  that  there  exists  such’  a  full  schedule,  with  the  minimum 
iinish  time  among  all  full  schedules,  that  conforms  to  the  super  sequence.  We  prove  the  existence  of  such  an 
optimal  full  schedule  in  a  more  systematic  way. 
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Theorem  3  Assume  that  the  task  set  T  is  feasible  and  p  is  an  optimal  full  schedule  of  I\ 
Let  Ta  and  Tp  be  two  top  tasks  of  p  such  that  Ta  -<  Tp.  If  (Tp,  Ta)  conforms  to  p,  then  there 
exists  another  optimal  full  schedule  p'  such  that  (Ta,Tp)  conforms  to  p'. 

PROOF:  Tq  and  Tp  are  two  top  tasks.  Let  F  be  a  frame  such  that  bp  =  sp  and  ep  =  fa. 
T0  •<  Tp  means  bp  =  sp  >  Tp  >  ra,  and  tp  =  fa  <  da.  If  there  exists  a  task  Ts  such 
that  F  U  Tx,  then  Ta  L!  Tx  too.  This  contradicts  to  the  fact  that  Ta  is  a  top  task.  Hence 
F  cannot  contain  any  task.  By  the  Leading  Theorem,  there  exists  another  sequence  p'  such 
that  (Ta,Tp)  conforms  to  p',  and  both  \p'\  =  |p|  and  /p-  <  f„  hold,  which  means  p'  is  an 
optimal  full  schedule  too.  □ 

When  two  tasks  match  each  other,  it  dose  not  matter  which  task  is  executed  first.  This 
gives  rise  to  the  following  Corollary. 

Corollary  2  Assume  that  the  task  set  T  is  feasible  and  p  is  an  optimal  full  schedule  of  T. 
Also  assume  that  (Alt . . . ,  Tp ,  Ta, . . . ,  A*),  the  subsequence  of  the  top  tasks  in  p,  conforms  to  p. 
If  T0  <  Tp,  then  there  exists  another  optimal  full  schedule  p'  such  that  (A*, . . . , Ta,  Tp, . . . ,  At) 
conforms  to  p'. 

PROOF:  Theorem  3  holds  when  Ta  <  Tp,  because  when  two  tasks  match  each  other,  the 
execution  order  of  the  two  tasks  is  arbitrary'.  Also  by  looking  at  the  adjustment  process  of 
Leading  Theorem,  we  can  find  that  the  tasks  located  before  and  after  Ta  and  T$  have  not 
been  adjusted.  This  verifies  the  corollary7.  D 


Corollary  3  Let  E  =  (Aj,  A2, . . . ,  At)  be  top  tasks  of  the  task  set  T  such  that  Aj  <3  A2  <3 
. . .  <  Aj .  If  T  is  feasible,  there  exists  an  optimal  full  schedule  p'  to  which  H  conforms. 

PROOF:  Since  T  is  feasible,  there  exists  an  optimal  full  schedule  p.  Let  K  =  (Aj,  A2, . . . ,  K) 
be  a  sequence  which  is  a  permutation  of  E  such  that  K  conforms  to  p.  We  would  like  to 
adjust  the  order  of  the  tasks  in  K  so  that  K  is  transformed  successively  into  H.  We  locate 
the  corresponding  task  of  A*  in  K,  where  x  is  chosen  in  the  order  of  1  through  t,  and  adjust 
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it  to  the  xth.  position  in  K  by  consecutively  swapping  hs  with  its  left  neighbor.  This  leads 
to  the  sequence  E.  During  the  swapping,  A -  always  weakly  leads  to  its  left  neighbor,  for 
Aj, . . . ,  A-_i  are  in  positions  1, ...  ,x  —  1.  By  Corollary  2,  there  always  exists  an  optimal  full 
schedule  to  which  the  intermediate  resultant  sequences  conform.  Therefore,  there  exists  an 
optimal  full  schedule  p'  to  which  (Aj,  A2, . . . ,  At)  conforms.  □ 

Given  an  optimal  full  schedule  p,  we  can  always  obtain  another  optimal  full  schedule  p' 
in  which  the  top  tasks  axe  ordered  according  to  the  weakly  leading  relations  by  Corollary  3. 
Therefore  the  rule  Rl  is  verified. 

Before  we  can  go  further,  the  following  definitions  are  useful.  Let  A*  be  a  top  task  and 
T-  a  nontop  task  of  a  sequence  S.  We  say  that  (A*.,  Tz)  is  a  disorder  pair  of  S  if  (hk,Tx) 
conforms  to  5  and  X-  -<  hk.  Similarly,  (Xx,  A*.)  is  a  disorder  pair  of  S  if  (X*,  At)  conforms  to 
5  and  kk  -<  X-.  The  disorder  degree  of  S  is  defined  to  be  the  number  of  disorder  pairs  in  S. 

Theorem  4  Assume  that  the  task  set  T  is  feasible  and  p  is  an  optimal  full  schedule  of  T. 
Let  Ai<]  A2<] . .  .<3  A,  be  top  tasks  and  T:  a  nontop  task  of  p.  Assume  that  p  =  LVTCQLBL?0Xt 
such  that 

(Aa, ....  A*_a)  conforms  to  Lyrt , 

{A^a,...,A.)  conforms  to  L?0 

We  have  the  following  properties: 

(1)  if  X-  •<  h.k  and  X  =  (A*,. .  .  .X-),  then  there  exists 
Ljrt  S  X  S  L?OS‘  such  that  X  is  a  permutation  of  X,  and 
disorder  degree  of  p’  is  less  than  that  of  p 

(2)  if  A*  -<  Tx  and  X  =  (X-,..., A*},  then  there  exists 
■Lpre  ©  X  ©  Lpcst  such  that  X  is  a  permutation  of  X,  and 
disorder  degree  of  p'  is  less  than  that  of  p 

PROOF:  We  will  prove  (1)  first.  Let  F  be  a  frame  with  bF  =  shk  and  ejr  =  /..  Since  Tx  -<  h.k , 
-r  —  fz  <  <  ikk.  Also  bF  —  Sh,k  >  rhk .  If  there  exists  a  task  XL,  located  between  hk  and 


another  optimal  full  schedule  p'  = 
(X-,  Afc)  conforms  to  X;  besides,  the 

another  optimal  full  schedule  p '  = 
(A k,Ts)  conforms  to  X;  besides,  the 


Ts  such  that  F  U  TWi,  then  hk  LI  TWi  too.  This  contradicts  to  the  fact  that  A*  is  a  top  task. 
The  condition  of  the  Leading  Theorem  is  satisfied.  Hence  there  exists  a  sequence  L  which 
is  a  permutation  of  L  such  that  { Tx,hk )  conforms  to  L  and  Il^&LbLpc,  —  /*•  Therefore, 
Lpre  ©  L  ©  Lpost  is  also  an  optimal  full  schedule.  Now  let  us  look  at  Figure  4(iv).  This  is 
the  schedule  after  the  adjustment  process  of  the  Leading  Theorem  is  made.  For  the  tasks 
whose  deadlines  are  less  than  cp,  they  all  lead  to  hk.  Note  that  the  disorder  is  a  relationship 
defined  between  a  nontop  task  and  a  top  task,  and  hk  is  the  only  top  task  in  the  frame 
F.  Therefore,  no  new  disorder  pairs  with  hk  are  introduced  among  these  tasks.  Similarly, 
for  the  tasks  whose  ready  times  are  greater  than  bp,  they  are  all  led  by  hk.  Therefore,  no 
new  disorder  pairs  are  introduced.  As  for  the  tasks  otherwise,  including  Ts  and  hk,  whose 
deadlines  are  greater  than  or  equal  to  tp  and  ready  times  less  than  or  equal  to  bp,  they  can 
be  ordered  arbitrarily.  Hence,  we  can  position  Ts  before  hk,  and  remove  the  disorder  pairs, 
if  anj7,  in  these  tasks  by  rearranging  the  proper  orders  for  them.  Thus  the  disorder  degree  of 
L  is  decremented  by  at  least  one.  So  the  disorder  degree  of  p'  is  less  than  that  of  p.  Property 
(2)  holds  for  the  same  reason.  D 

Note  that  T-  does  not  match  hk  or  otherwise  T-  is  also  a  top  task,  which  contradicts 
our  assumption. 

Theorem  5  Assume  that  the  task  set  T  is  feasible  and  p  is  an  optimal  full  schedule  of  F. 
Let  h\  <  h2  <3  . .  -  <  h^  be  top  tasks  of  p.  There  exists  an  optimal  full  schedule  p'  such  that 
{hi,  h2,  ...,ht)  conforms  to  p',  and  for  any  nontop  task  Tx  such  that  {hie,  Afc+1)  conforms 
to  p',  either  T-  U  hk  or  Tx  U  hk+ 1 . 

PROOF:  Assume  that  Tx  is  a  nontop  task  such  that  (hk,T-,hk+i)  conforms  to  p'.  If  Tx  does 
not  contain  hk  and  Tx  does  not  contain  A*+i,  then  either  T~  -<  hk  or  hk+i  -<  Tx.  Hence, 
either  ( hk,Tz )  or  (Tx,hk+ 1)  is  a  disorder  pair.  We  can  eliminate  it  through  Theorem  4,  and 
the  disorder  degree  is  decremented  by  at  least  one.  "Whenever  there  is  a  disorder  pair  in  the 
schedule,  we  can  always  apply  Theorem  4  to  eliminate  it.  The  disorder  degree  is  decremented 
in  this  way  until  finally  reaching  zero.  Hence,  { hk,Tx ,  Ajc+i)  conforming  to  p'  implies ib at  Tx 
is  not  leading  to  hk  and  is  not  leading  to  Tx.  The  only  possibilities  are  either  Ts  U  hk 
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ox  T*  U  h+i . 


□ 


Theorem  5  confirms  the  validity  of  rules  Rl  and  R2. 

Theorem  6  (Dominance  Theorem)  If  a  task  set  T  is  feasible,  there  exists  an  optimal 
full  schedule  p  such  that  p  conforms  to  the  super  sequence  of  F. 

PROOF:  In  Theorem  5,  we  verify  the  existence  of  the  optimal  full  schedule  such  that  the 
top  tasks  are  ordered  according  to  their  weakly  leading  relations,  and  the  nontop  tasks 
are  located  in  the  appropriate  subsets  between  top  tasks.  The  only  work  left  is  to  order 
the  nontop  tasks  in  each  subset.  The  adjustment  process  of  the  Leading  Theorem  can  be 
applied,  and  the  resultant  order  is  exactly  specified  by  rule  R3.  So  we  can  conclude  that 
there  exists  an  optimal  full  schedule  p  which  conforms  to  the  super  sequence.  □ 

3.4  Conformation  Theorem 

If  there  is  no  task  rejected  in  p,  there  exists  an  optimal  full  schedule  conforming  to  the 
super  sequence  of  T.  However,  if  T  is  not  feasible,  some  tasks  in  T  should  be  rejected.  The 
dominant  rules  axe  developed  based  on  the  assumption  that  no  task  is  rejected.  When  tasks 
axe  allowed  to  be  rejected,  the  situation  is  different.  The  issue  to  be  raised  is  whether  the 
decent  solution  for  feasibility  test  can  be  applied  to  our  optimization  problem.  Remember 
that  by  optimization  we  mean  that  the  number  of  rejected  tasks  in  the  schedule  is  minimized 
and  then  the  finish  time  of  the  schedule  is  also  minimized.  When  a  task  set  is  feasible,  the 
optimal  schedule  is  also  the  optimal  full  schedule.  The  difficulties  axe  addressed  in  the  next 
section,  followed  by  the  approach  and  proof  to  solving  the  difficulties. 

3.4.1.  Difficulties 

We  wish  to  make  use  of  the  super  sequence  as  search  space  in  our  scheduling  problem.  The 
difficulties  are  twofold. 

First,  when  a  task  is  allowed  to  be  rejected,  the  dominant  rules  specifying  the  relations 
among  containing  tasks  and  contained  tasks  need  to  be  modified,  because  the  rules  are 
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Figure  5:  The  optimal  schedule  may  not  conform  to  the  super  sequence. 

developed  based  on  the  assumption  that  no  task  is  to  be  rejected.  The  new  rules  can 
become  quite  complicated.  Let’s  look  at  the  example  depicted  in  Figure  5.  Assume  that  the 
task  set  is 

T  =  {T,S2,Ts,T4,Ts}, 

and  the  super  sequence  of  the  task  set  is 

a  =  (ri,r2,r3,T4,T2,r3,r1,Ts). 

The  top  tasks  are  typed  in  bold  letters  for  emphasis.  F  is  not  feasible.  We  can  see  that 
one  possibility  of  the  optimal  schedule  could  be 

Po  =  (T2,  T\.  7-j,  T$). 

Apparently,  po  does  not  conform  to  A.  One  may  be  able  to  show  that  another  optimal 
schedule  (T2,T.t, T^Ts)  conforms  to  A.  Houcver;  given  an  arbitrary  task  set,  it  is  not 
guaranteed  that  one  is  always  able  to  do  so.  In  the  example,  TA  is  rejected.  If  we  recompute 
the  super  sequence  without  T4,  we  would  get  a  different  super  sequence.  The  new  super 
sequence  would  be 

Ao  =  (Ti,T2,Ti,  T3,!Ti,  T5), 

to  which  po  conforms.  This  gives  a  great  difficulty.  It  seems  that  -we  need  to  check 
against  each  task.  Construct  a  super  sequence  in  condition  that  the  task  is  accepter,  and 
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construct  another  super  sequence  in  condition  that  the  task  is  rejected.  In  general,  we  need 
to  construct  2~  super  sequences  in  this  way.  This  is  too  formidable  to  schedule,  considering 
that  the  number  of  instances  in  each  individual  super  sequence  can  be  exponential  to  the 
size  of  the  task  set. 

Secondly,  while  rejecting  a  nontop  tasks  does  not  affect  much,  rejecting  a  top  task  could 
affect  the  duplication  and  positions  of  the  nontop  tasks  or  might  even  result  in  some  new 
top  tasks.  Thus,  the  super  sequences  can  be  totally  dinerent..  Look  at  tne  same  example  in 
Figure  5.  The  rejection  of  X!<  results  in  two  more  top  tasks.  i.e.,  1?  and  T3.  This  makes  Ao 
completely  dinerent  from  A. 

We  propose  a  swapping  and  replacing  procedure  to  overcome  the  difficulties.  The  proce¬ 
dure  would  be  described  and  verified  m  the  loiiowmg  section. 


3.4.2  Aon  roach  and  Proof 

The  idea  of  our  approach  is  stated  briefly  below,  rollowed  a  lormal  prooi.  Let  T  and  A  be  the 
original  task  set.  and"  the  super  sequence  or  the  task  set  respectively.  It  is  clear  that  there 
exists  an  optimal  schedule,  which  is  unknown  to  us,  lor  any  task  set.  Let  i  0  be  the  task' 
set  which  is  composed  of  the  tasks  of  the  unknown  optimal  schedule,  and  Ao  be  the  super 
sequence  of  F0.  F0  and  A0  are  also  unknown  to  us.  As  mentioned  above,  A0  might  be  quite 
dinerent  from  A.  Notice  that  the  unknown  optimal  schedule  of  T  is  also  an  optimal  (full) 
schedule  of  F0-  Since  F0  is  feasible,  by  the  Dominance  Theorem,  there  exists  an  optimal  full 
schedule  for  Fo,  say  p0.  such  that  p0  conforms  to  Ao-  Our  problem  is  that  we  are  not  able  to 
compute  po  from  Ao,  because  Ao  is  unknown.  We  are  able  to  compute  A  from  F  by  applying 
the  dominant  rules.  The  swapping  and  replacing  procedure  exploits  the  way  to  adjust  the 
order  of  tasks  in  p0  and  to  replace  some  tasks  if  necessary,  so  as  to  transform  po  into  a  new 
schedule  p  such  that  p  is  also  an  optimal  schedule  and  best  of  all  p  conforms  to  an  instance 


of  A.  For  the  sake  of  simplicity,  we  will  say  a  schedule  conforms  to  A,  when  the  schedule 
conforms  to  an  instance  of  A.  So  we  can  use  A  as  a  valid  search  space  when  scheduling  F. 
In  the  example  of  Figure  5,  we  Transform  p0  into 

r=(T7tT<tTs.Ts). 
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This  example  is  so  simplified  that  the  existence  of  p  can  be  verified  by  mere  intuition. 
However,  the  reasoning  is  far  more  complicated  than  it  appears  at  the  first  glance.  We  are 
going  to  prove  in  the  following  theorem  that  such  an  optimal  schedule  p  that  conforms  to  A 
always  exists.  The  corresponding  lemmas  are  presen  .  ;i  in  the  next  section. 

Theorem  7  (Conformation  Theorem)  Given  a  task  set  T  =  {Xi,Xj,..., Xn},  there  exists 
an  optimal  schedule  p  such  that  p  conforms  to  the  super  sequence  A  of  F. 

PROOF:  Given  any  task  set  F,  there  exists  at  least  one  optimal  schedule,  which  is  unknown 
to  us.  Assume  that  we  need  to  reject  w  tasks  from  F  to  make  a  feasible  schedule.  Let  Fo  be 
the  task  set  which  is  composed  of  the  tasks  in  the  unknown  optimal  schedule.  To  is  a  subset 
of  F.  The  super  sequence  of  To  is  denoted  by  A0.  In  addition,  we  use  Tj,  0  <  j  <  to,  to 
represent  a  task  set  derived  by  adding  j  tasks  into  Fo,  A;-  the  super  sequence  of  Fj,  and  pj 
an  optimal  schedule  of  Fj.  When  we  say  adding  j  tasks  into  Fo,  we  mean  that  the  resultant 
task  set  Fj  is  composed  of  distinct  tasks  and  Fj  is  a  subset  of  T.  In  particular,  F^  is  F. 
We  will  prove  by  induction  on  w  to  show  that  there  exists  an  optimal  schedule  pw  for  F 
conforming  to  Aw. 

Base  step  w  =  0:  there  is  no  task  rejected.  F  =  To-  Since  T  is  feasible,  by  the  Dominance 
Theorem,  there  exists  an  optimal  (full)  schedule  po  lor  T  such  that  po  conforms  to  A0. 

Induction  hypothesis:  assume  that  the  theorem  holds  when  w  s=  j,  i.e.,  |po!  =  n—  j.  For 
the  task  set  Tj  which  is  derived  by  adding  j  tasks  into  To,  there  exists  an  optima]  scheduie 
Pj  for  r  such  that  pj  conforms  to  A  j.  Notice  that  |pj]  =  |p0j,  and  |Fj|  =  fTol  +  j- 

Now  consider  the  case  when  w  =  j  +  1,  i.e.,  jpo|  =  n  —  (j  - j- 1).  We  need  to  reject  j  - r  1 
tasks  to  make  a  feasible  schedule.  There  exists  an  optima]  schedule  pj  for  T  conforming  to 
A  j  by  induction  hypothesis.  We  want  to  show  that,  by  swapping  and  replacing  the  tasks 
in  pj.  the  resultant  sequence  pJ+1  conforms  to  Aj+J;  besides,  |pJ+1]  =  Jpy J,  and  <  fPj, 
which  implies  that  pJ+i  is  also  an  optimal  schedule  for'  T.  Let  T.  be  the  task  added  into  Tj 
to  make  T .  So,  Tj  U  {X-}  =  r,-+ 1.  There  are  two  possibilities  when  adding  Ts. 

If  X=  is  a  nontop  task  of  Tj+i,  adding  X-  does  not  add  a  top  task  into  Fj.  The  orders 
of  the  top  tasks  in  both  A j+3  and  A j  derived  through  ruie  III  axe  exactly  the  same.  Rule 
R2  specifies  the  relation  between  a  nontop  task  and  a  top  task.  Adding  a  nontop  task  X- 
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does  not  affect  the  relations  between  the  already  existent  nontop  tasks  and  top  tasks.  The 
positions  (duplicates)  of  the  already  existent  nontop  tasks  in  Aj  are  preserved  in  A y+3 .  Rude 
R3  specifies  how  to  arrange  the  order  of  the  nontop  tasks  within  each  subset.  Again  adding 
a  nontop  task  Tx  does  not  alter  the  orders  of  the  already  existent  nontop  tasks  in  each  subset 
in  Aj.  Therefore,  if  the  task  being  added  is  a  nontop  task,  A j  is  a  subsequence  of  Aj+1.  Let 
us  look  at  the  example  in  Figure  5.  Assume  that  Tj  and  Tj+i  are 

T;  =  {r2,T3,r4,r5}, 

Ij+i  =  {Ty,T7,Tz,TA,Ts}, 

where  Ti  is  a  nontop  task.  The  corresponding  super  sequences  would  be 
Aj  =  (T2,T3,T4.T2,T3,T5),  and 
A;+i  =  (Ty,T^T^Tz,TuTz). 

We  can  see  in  the  example  how  A  j  conforms  to  Aj+i . 

Otherwise,  Tz  is  a  top  task  of  Tj+i .  T-  does  not  contain  other  tasks  in  Two 

situations  are  possible. 

(i)  T=  is  not  contained  by  other  tasks.  The  number  of  top  tasks  in  r,-jo  is  one  more  than 
that  of  the  top  tasks  in  Tj.  The  order  of  the  top  tasks  in  Aj  is  preserved  in  Aj+i,  since 
the  relations  of  the  top  tasks  are  not  altered  by  adding  Tx.  Furthermore,  T-  does  not  alter 
any  existent  orders  among  the  nontop  tasks  and  top  tasks,  or  among  the  orders  between  the 
nontop  tasks  and  non  top  tasks,  specified  by  rules  R 2  and  R3,  respectively.  Therefore,  A,  is 
a  subsequence  of  A ,-+J.  Let  us  look  at  the  example  in  Figure  6.  Assume  that  Tj  and 
are 


Tj  =  {Tl5T2,r4,T5},  and 

Fi+i  =  {ra,r2,T3,r4,r5}, 

where  T3  is  a  top  task  not  contained  by  other  tasks.  The  corresponding  super  sequences 
would  be 


A  j  =  (T1,T2!Ti,T4,T5,T4),  and 
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Figure  6:  The  added  top  task  T3  is  not  contained  by  other  tasks. 


A?'+i  =  (Ti,T2,Ti,T3,!T4,T5,r4). 

We  can  see  in  the  example  how  A  j  conforms  to  AJ+i . 

(ii)  Ts  is  contained  by  some  top  tasks  and/or  nontop  tasks  of  Tj.  Let  the  top  tasks  of 
Tj  containing  Tx  be  51,  • . .  ,gm,  indexed  in  the  weakly  leading  order.  This  situation  is  more 
complicated,  because  i  =  1,. .  .,m,  turn  out  to  be  nontop  tasks  in  Fj+i-  There  exists  a 
total  ordering  of  them  by  weakly  leading  relations,  because  there  is  no  containing  relations 
among  p,-.  By  rule  Rl,  the  super  sequence  of  Tj+ 1  can  be  expressed  as 

Aj^-i  —  (•  •  •  j  hk—1 )  •  •  •  1  Pi  5  •  •  •  5  Pm 5  •  •  •  ■  »Pl)  •  '  '  )  Pn»l  •  •  *  j  hk+l  i  •  •  •)» 

■where  hi, . . . ,  hk -\,hk,  hk+i, . . .  are  the  top  tasks  in  ri+i  ,  and  in  particular,  hk  represents 
Tx.  By  rule  R3,  the  super  sequence  of  Tj+i  can  also  be  expressed  as 

A?+l  =  {••*•.  hk—1  •  &k- l.X’  Bk—lJf'  Bk—l,k'-  Bk.k-rl  1  Bk,k+h  t  '^k-rlt  •  •  ■)(  ^.6) 

V _  ...  _  _  .  _  _ _  ... _ ^  * 


where  represents  the  subsequence  of  A, 41  between  h*_3  and  excluding  hk-i 
and  hk+i,  as  depicted  above,  and  where  ©  means  concatenation 

of  sequences.  Remember  that  gi,...,gm  are  top  tasks  of  Tj.  All  the  top  tasks  in  Tj  are  in 
the  order  of  hi,...,  fc*-i,Pi, .  ..,gm,  hk+i, ...  by  the  weakly  leading  relations.  By  rule-  Rl, 
the  super  sequence  of  Tj  can  be  expressed  as 

A  j  (•  -  •  ?  hk- 1 , ...,  g\, ... ,  g^n , . . .,  hk+i (7) 

™  ’ 
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where  fly  represents  the  subsequence  of  A j  between  hk-y  and  hi+1,  excluding  hk-y  and 
hk+ 1,  as  depicted  above.  Notice  that  in  Equations  6  and  7,  the  subsequences  before  hk-i 
of  both  A;+i  and  A  j  axe  exactly  the  same,  because  the  addition  of  the  top  task  h*,  or  Ts, 
affects  only  the  subsequence  between  hk-y  and  hk+y .  Similarly,  the  subsequences  after  hk+y 
of  both  Aj+i  and  A;-  axe  exactly  the  same  too.  Hence  an  instance  of  Aj+y  will  differ  from 
an  instance  of  A  y  only  in  the  the  subsequences  of  Hy+i  and  Hy. 

Now  we  would  like  to  check  what  tasks  in  fly  should  follow  immediately  after  hk-y.  By 
Lemma  7,  all  the  tasks  in  fly  can  be  found  in  fly+i  —  hi.  So  we  only  need  to  check  the  tasks 
in  CLj+i  —  hk.  If  a  task  contains  hk-y  but  not  hi,  the  task  must  not  contain  gy.  Because 
Pi  contains  hi,  any  task  which  contains  gy  should  also  contain  hi.  When  constructing  the 
subsequence  of  A  y  between  hk-y  and  gy,  by  rule  R2,  all  the  tasks  which  appear  in  Bk_kj  of 
A j+y  should  follow  immediately  after  hk-y]  and  by  rule  R3,  the  order  of  the  subsequence  is 
exact}}-  the  same  as  Bk_k$. 

One  may  observe  that  some  tasks  in  Bk-y,k  contain  hk-y  but  do  not  contain  pi,  so  they 
would  also  be  positioned  between  hk-y  and  gy.  These  tasks  would  follow  after  the  tasks  of 
B*-yJ-  Lhis  is  because  they  do  contain  hi  and  hence  have  greater  deadlines  than  those 
tasks  in  For  the  same  reason,  all  the  tasks  which  appear  in  Bkk +y  of  Aj+j  should  be 

located  immediately  before  h>+1  when  constructing  A  y,  and  the  order  is  the  same.  Hence, 
the  A  j  can  be  further  expressed  as 


A-  —  hi— i,  Bk—^£* . . . ,  gi, . . . ,  gm ,  •  ■  ■ j hk+y. . . .), 


O' 


(S) 


where  fl'-  represents  the  subsequence  between  Bk_xj  and  A+1  of  A  y.  excluding  Bk_xj 
and  Bj'kJrl.  We  have  fly  =  Bk_lj  &  ft'-  &  By  Lemma  9,  all  the  tasks  in  fly  are  either 

in  Bk-yj:  or  in  Bj^.k- 

Let  us  look  at  an  example  in  Figure  7.  The  task  set  in  the  figure  is  Ty+i .  And  Ty+i  —  hi 
would  be  Tj.  gy  and  gy  contain  only  hi.  So  p:  and  gy  are  classified  as  nontop  tasks  in  rwr, 


F 


i 


» 
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Figure  7:  The  added  top  task  h*  is  contained  by  other  tasks. 


and  as  top  tasks  in  Tj.  We  can  compute  the  super  sequences  as  follows. 


Aj>i  =  (Ti,T2,Tz,T4,h^i,  Ta^  ,2;,  T3,  7\,  g\  h^,  T?,  9\  ,97 ,  7j^,  T$,  ,h^+1,Ti,Ts), 

•Bk-*43  %,*+ j 


£*-i.*  &k-i,k  %rr,A 


n 


>4J 


n 


J+3 


=  (21i»r2,73, T4. h^-i,  T<  ,T2,T3.!ri,gi,T3. Ti,g2,T3,Ti,  I5  .h^j.JijTj). 


n; 


■®5.k+i 


o> 


Now  going  back  to  examine  Equations  6  and  8,  we  can  find  that  A,  and  A j+i  only  differ  in 
the  middle  subsequences  represented  by  Q'  and  flj+3 .  This  can  also  be  seen  in  the  example  in 
Figure  7.  The  instances  extracted  from  A;-  would  conform  to  Aj+i  except  the  corresponding 
middle  subsequence  mentioned  above.  Remember  that  pj  conforms  to  Aj.  We  would  try  to 
adjust  the  order  of  the  tasks  of  the  subsequence  in  pj  which  correspond  to  fi'-  for  the  purpose 
that  the  resultant  schedule  pj+ j  conforms  to  A^+i  and  pj+ 3  is  also  an  optimal  schedule  of 
T.  The  adjustment  procedure,  called  the  swapping  and  replacing  method,  applied  to  p,  is 
described  below’: 
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Cl:  for  all  tasks  Ty  €  ft'-  such  that  /v  <  dhk,  they  are  sorted  by  their  ready  times  ry. 

C2:  fox  all  tasks  Tv  €  f !'•  such  that  sy  >  rkk,  they  are  sorted  by  their  deadlines  dy. 

C3:  a  task  can  be  sorted  by  Cl  or  C2  described  above  if  the  task  satisfies  both  conditions. 
C4:  if  there  exists  a  task  Ty  €  D'-  such  that  sy  <  and  /y  >  d^k,  Ty  is  replaced  by  h 

We  would  like  to  show  that  the  adjustment  does  make  pj+y  conform  to  Aj+j.  Remember 
that  A j  and  Aj+1  only  differ  in  the  middle  subsequences  represented  by  f !'•  and  .  We 
only  swap  and  replace  the  tasks  of  pj  located  in  to  derive  pj+\.  Since  pj  conforms  to 
A  j,  the  head  and  the  tail  of  pj+y  also  conforms  to  A  j+y.  So  we  only  need  to  check  the 
middle  subsequence  of  pJ+i  to  see  if  the  whole  sequence  of  pj+y  conforms  to  A j+y.  The 
tasks  adjusted  by  the  swapping  and  replacing  procedure  are  either  in  Bk-i,k  or  in 
by  Lemma  9.  Let  us  first  check  the  adjustment  of  Cl.  In  ,  the  order  of  the  tasks  in 
Bk-\,k  can  be  determined  arbitrarily  according  to  rule  R3,  so  it  does  not  matter  which  task 
is  located  before  which.  And  in  A ,41 ,  the  order  of  the  tasks  in  Bj~^k  is  determined  by  their 
ready  times.  During  the  adjustment  of  Cl,  all  the  qualifying  tasks  are  ordered  according  to 
their  ready  times.  We  know  that  the  ready  times  of  the  tasks  in  Bk-i,k  axe  less  than  the 
ready  times  of  the  tasks  in  Bjz^,k’  So,  in  the  resultant  schedule  pj+y,  the  tasks  of  Bk~y.,k 
are  positioned  before  the  tasks  of  This  indicates  that  the  order  of  these  tasks  in 

Pj-fi  after  the  adjustment  of  Cl  conforms  to  Aj+j.  For  the  same  reason,  the  adjustment  of 
C2  makes  the  order  of  the  swapped  tasks  conform  to  Aj+y.  In  condition  C4,  if  such  a  Tv 
exists,  replacing  Tv  by  hk  also  conforms  to  Aj+i,  which  be  seen  in  Equation  6.  Each 
task  Ty  £  fl'  satisfies  one  of  the  conditions  by  Lemma  11.  Hence,  all  the  tasks  in  the  middle 
subsequence  of  p^y  are  adjusted  in  such  a  way  that  the  order  conforms  to  A ,*+j.  So  p^y 
conforms  to  Aj+y. 

Now  we  wonld  like  to  show  that  Pj+i,  in  addition  to  conforming  to  Aj+i,  also  be 
fimsbed  no  later  than  pj.  We  can  view  the  tasks  satisfying  condition  Cl  as  having  the  same 
virtual  deadlines  of  dki ,  because  they  all  finish  before  this  time  instant.  Hence,  there  is  a 
total  ordering  among  the  tasks  with  virtual  deadlines  by  weakly  leading  relations,  which 
is  achieved  by  sorting  their  ready  times.  By  Lemma  10,  the  finish  time  of  the  resultant 
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schedule  after  the  adjustment  of  Cl  would  not  be  greater  than  that  of  the  original  schedule. 
Similarly,  the  tasks  satisfying  condition  C2  can  be  viewed  as  having  the  same  virtual  ready 
times  of  r^,  because  they  all  start  after  this  time  instant.  For  the  same  reason,  the  finish 
time  of  the  resultant  schedule  after  the  adjustment  of  C2  would  not  be  greater  than,  that  of 
the  original  schedule.  In  condition  C3,  the  qualifying  tasks  can  be  sorted  in  either  way  and 
does  not  affect  the  result.  In  condition  C4,  if  there  exists  a  task  Ty  whose  computation  time 
covers  the  whole  window  of  the  rejected  task  hk,  we  may  as  well  replace  Ty  by  hk,  and  the 
finish  time  of  the  resultant  schedule  after  the  adjustment  of  C4  would  not  be  greater  than 
that  of  the  original  schedule.  Each  task  Ty  €  fij  satisfies  one  of  the  conditions  by  Lemma  11. 
Hence,  all  the  tasks  in  the  middle  subsequence  of  pj+y  are  adjusted  in  such  a  way  that  pj+y 
would  be  finished  no  later  than  pj.  Therefore,  =  \pj\,  and  /Pj+3  <  fpj.  Since  pj  is  an 

optimal  schedule  of  T,  pj+y  is  also  an  optimal  schedule  of  T. 

How  the  adjustment  procedure  makes  the  finish  time  shorter  is  illustrated  by  Figures  8. 
In  Figure  8(i),  both  Ty  and  Tz  satisfy  condition  C3,  and  Tj  satisfies  condition  Cl.  The 
procedure  of  Cl  is  applied  to  all  these  three  tasks  and  makes  the  finish  time  shorter.  The 
dotted  task  window  frame  in  the  figure  indicates  that  hk  is  a  rejected  task.  In  Figure  8(ii), 
Cl  is  applied  to  the  qualifying  tasks  Ty  and  Tz-  And  by  C4,  T3  is  replaced  by  hk-  This, 
makes  the  finish  time  shorter.  While  it  is  hk  that  is  rejected  before  the  adjustment,  it  turns 
out  that  T3,  whose  computation  time  covers  the  whole  window  of  hk,  is  rejected  after  the 
adjustment. 

So  far,  we  have  shown  that  pj+y  conforms  ic>  Aj+1,  and  that  pj+y  is  also  an  optimal 
schedule  of  T.  The  theorem  is  thus  verified  by  the  induction.  It  deserves  our  attention  chat 
we  do  not  really  apply  the  swapping  and  replacing  procedure  to  any  schedule.  We  just  want 
to  show  the  existence  of  the  optimal  schedule  which  is  pj+y  in  the  context.  To  make  it  clear, 
the  structure  of  the  theorem  is  illustrated  in  Figure  9.  D 

3.4.3  Corresponding  Lemmas 

The  lemmas  used  by  the  conformation  theorem  are  demonstrated  as  follows 
Lemma  5  Bk_zj  D  Bk-y,k  =  Bk-y,k  n  Bjzy,k  =  Bk_lk  n  B^k  =  0 
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Figure  8:  The  swapping  and  replaning  procedure  Cl,  C3  and  C4. 

Bki H  Bk,k+ i  =  Bk±+ i  n  •%,*+]  =  -®fc.I+T  n  =  ®- 

PROOF:  We  first  show  that  B^j  n  B*_i.*  =  0.  Given  any  task  7V  €  Ty  does  not 

contain  h*  by  definition.  Bence,  7„  does  not  belong  to  Br.-i.k-  So  Bk_xj  n  Bk-i,k  =  0-  Tiie 
others  can  be  proved  similarly.  D 


Lemma  6  Bk-i.i  U  Bjzi  k  —  Bkj+$  U  Bk,k+ i- 

PROOF:  We  first  prove  that  if  a  task  Ty  €  Bk-\,k  U  Bjzi,k->  then  Ty  €  Bkj+ y  U  Bk,k+ i- 
Because  Ty  contains  hk  by  definition,  Ty  must  have  a  location  after  hk  too  by  rule  R2. 
Ty  €  Bk7ij  u  Bjbjk+1  u  BItt41.  Tv  does  not  belong  to  BI>+1,  so  Tv  €  Bkj^  U  Bk,*+i-  We 
can  prove  similarly  that  if  a  task  Ty  €  BkjZ\  U  Bk,k-r i,  ^hen  Ty  €  Bk- U  Bk_i  k-  So 
B'r.—I.k  u  B^n.fc  =  Bk£Z T  U  • 
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Figure  9:  The  Structure  of  the  Conformation  Theorem 
Lemma  7  If  and  only  if  Ty  €  Bk_  U  Bk-i,k  U  Bjzi,k  u  u  Bk,k+i u  Bj^+\  then  Ty  €  Si  j. 

PROOF:  We  will  first  prove  the  Tif”  part.  By  Lemma  6,  Bk~\,k  U  Bjzi,k  =  Bkj^  .  So 

we  only  need  to  check  against  Bk_lk  U Bk-i,k  U Bjzij,  U .  If  Tv  €  Bk_l  k  or  Ty  € 
then  Ty  has  a  location  between  the  top  tasks  and  g2  in  Ar  This  is  because  Ty  contains 
hk- 1,  Ty  has  a  location  after  by  rule  R2.  So  Ty  €  f 1j.  Then  consider  Ty  €  Bj—^^-  Ty 
is  either  a  top  task  or  a  non  top  task  in  A j.  If  Ty  is  a  top  task  in  Aj,  Ty  must  be  one  of 
the  g,,  i  —  1  by  the  definition  of  g,-.  If  Ty  is  a  nontop  task  in  A,,  it  must  contain  at 

least  one  top  task,  which  is  a  top  task  among  g2, . . .  ,gm,  or  hk+2  by  referring  to  Equation  7. 
Notice  that  Ty  must  not  contain  hk- 1  since  Ty  €  Bjzi^-  hT°  matter  whether  Ty  is  a  top  or 
nontop  task  in  Aj,  Ty  has  a  location  in  Slj  by  rule  R2.  If  Ty  €  Bkk^ ,  then  Ty  has  a  location 
between  the  top  tasks  gm  and  hk+2  in  Aj  by  rule  R2.  So  Ty  €  Slj. 
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Now  we  prove  the  ’’only  if”  part.  Let  Ty  be  a  task  located  in  fl;-.  If  Ty  is  one  of  the  top 
tasks  of  pi , . . .  ,pm,  Tv  contains  hk.  Either  Ty  €  Bk- i,jt  or  Ty  €  Otherwise,  Ty  is  a  non¬ 
top  task  of  A,-.  By  rule  R2,  Ty  contains  at  least  one  of  the  top  tasks  of  . .  ,gm,  A*+a. 

If  !Ty  contains  one  of  pi, . . .  then  Tv  also  contains  So  Ty  contains  either  hk- 1,  or  A*, 
or  hfc+1.  By  rule  R2,  Ty  should  fall  between  hk-i  and  hk ,  and/or  between  hk  and  So 

^  &k~\J  O  Bk~\,k  U  B-£Zi  k  U  0  U  Bkk+1 ■  E 

This  lemma  means  that  the  tasks  in  ftj+1  —  h*  are  exactly  the  same  tasks  which  axe  in 

ft;. 

Lemma  8  If  Ty  €  ft;,  Ty  contains  hk. 

PROOF:  We  would  like  to  show  that  if  Ty  does  not  contain  hk,  then  Ty  does  not  belong  to 
Cl'y  Since  p,-,  i  —  contains  hk,  that  Ty  does  not  contain  hk  means  that  Ty  does 

not  contain  g;  either.  Hence  Ty  can  not  have  a  location  in  the  subsequence  between  gi  and 
5t7.  in  ft;.  The  only  possible  locations  of  Ty  to  fall  in  ft'-  are  either  between  and  p1?  or 
between  gm  and  hk+x.  If  Ty  falls  between  hk^  and  p1;  that  Ty  does  not  contain  pa  implies 
that  Ty  contains  hi_j  by  rule  R2.  That  is  Ty  €  Bk_^.  A  nontop  task  cannot  have  duplicate 
positions  in  the  same  region  between  two  adjacent  top  tasks.  Bk_lj  is  located  between  hk~y 
2nd  9i  by  Equation  8.  Ty  does  not  have  a  location  in  the  head  of  ft'-  before  pi .  For  the  same 
reason,  Ty  does  not  have  a  location  in  the  tail  of  ft'  after  gm.  So  Ty  does  not  belong  to  ft'-. 
Therefore,  if  Ty  €  ft',  Ty  contains  hk.  □ 

Lemma  9  If  Ty  €  ft',Ty  €  Bk-  lf*  U  B—Uk  =  BkJ^  U 

PROOF:  If  Ty  €  ft;,  then  Ty  €  ft;.  By  Lemma  7,  Ty  €  Bk_lk  U  Bk^k  U  ?Jk  U  Bkj^  U 
£t,k+ 1  We  know  that  Ty  contains  hk  by  Lemma  8,  so  Ty  €  U  B-^^  U  Bkj^  U 

Bk,k+\  ■  Also  by  Lemma  6,  we  have  Ty  €  U  Bjzi,k  =  Bkj+i  U  Bk>k+x .  O 

Lemma  10  Assume  that  S  =  Lpre®BSLpest  is  a  feasible  sequence,  where  L  —  (T~, ,  Tx? . ...  ,T~ 
If  there  exists  a  sequence  X  =  (TVJ, TK, . . .  ,TVi)  such  that  X  is  a  permutation  of  T  and  the 
tasks  of  X  are  ordered  by  the  weakly  leading  relation.  We  have  /L^_  eL©L,*„  —  fLprceL&Lpe,f 
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PROOF:  We  bubble  sort  the  tasks  of  L  in  weakly  leading  order.  The  swapping  only  occurs 
between  two  adjacent  tasks.  For  each  swapping,  we  apply  the  Leading  Theorem  to  the  adja¬ 
cent  tasks,  which  correspond  to  Tp  and  Ta  respectively  in  the  theorem.  No  other  tasks  lie  in 
between  the  two  tasks  during  each  individual  swapping.  So  the  finish  time  of  the  resultant 
schedule  is  not  greater  than  that  of  the  original  schedule  according  to  the  Leading  Theorem. 
□ 

Lemma  11  A  task  Ty  £  should  satisfy  one  of  the  conditions  Cl,  C2,  C3  or  C4. 

PROOF:  If  Ty  €  then  Ty  €  Bk~i,k  U  by  Lemma  9,  which  implies  that  Ty  L)  h*.  We 
have  ry  <  r^k  and  dv  >  dhk-  There  are  four  possibilities. 

(i)  sy  <  rhk  and  fy  >  <4k:  C4  is  satisfied. 

(ii)  -Sy  >  rhk  and  fy  <  dhk'-  C3  is  satisfied. 

(iii)  sy  <  r^k  and  fy  <  dhk:  Cl  is  satisfied. 

(iv)  sy  >  rkk  and  fy  >  dkk:  C2  is  satisfied. 

□ 

3.5  Set- Scheduler  Algorithm 

By  Conformation  Theorem,  we  have  shown  that  there  exists  an  optimal  schedule  which 
conforms  to  the  super  sequence  A.  Hence,  we  can  use  Sequence-Scheduler  to  schedule  for 
each  instance  in  the  super  sequence,  and  pick  up  the  best  one.  Since  Sequence- Scheduler 
obtains  the  optimal  schedule  for  each  instance,  we  end  up  with  the  optimal  schedule  for 
the  task  set.  The  algorithm  for  scheduling  a  task  set  is  given  in  Figure  10.  The  Sequence- 
Scheduler  takes  0(n2)  time  for  each  instance,  while  there  are 

*  =  5(9+1)”' 

instances  to  check  in  the  super  sequence  as  illustrated  in  the  previous  section.  The  time 
complexity  of  Set-Scheduler  algorithm  is  thus  0(N  *  n2). 


249 


Algorithm  Set-Scheduler: 


Input:  a  task  set  F  =  {Ti,T2,  ...,7k} 

Output:  the  optimal  schedule  p  for  F 
compute  the  super  sequence  A  for  F 
/>:=  0 

for  each  instance  I  in  the  super  sequence  A 

invoke  Sequence- Scheduler  to  compute  the  optimal  schedule  a(n)  of  I 
if  (|p|  <  |a(n)|)  or 

(IpI  =  k(n)|  and  f„  >  /ff(n)) 
p  a(n) 
endif 
endfor 


Figure  10:  Set-Scheduler  Algorithm 


4  Evaluation 

Experiments  are  conducted  to  compare  the  performance  of  Set- Scheduler  with  those  of  the 
well-known  Eaxliest-Deadline-First  and  Least-Laxity-First  heuristic  algorithms.  The  rela¬ 
tions  among  the  tasks  are  important  for  the  schedulability  of  the  tasks.  To  study  the  dif¬ 
ferences  between  different  cases,  we  allow  the  variation  of  the  computation  times,  and  the 
interarrival  times,  which  are  the  time  intervals  between  the  ready  times  of  two  consecutive 
tasks.  Tasks  in  a  task  set  cue  generated  in  non-descending  order  by  their  ready  times.  The 
parameters  of  the  experiments  are  random  variables  with  truncated  normal  distribution,  as 
shown  in  Figure  11.  If  the  computation  time  of  a  task  is  greater  than  its  window  length,  the 
computation  time  is  truncated  to  its  window  length.  Such  a  truncation  is  not  applied  to  the 
interarrival  times. 

The  mean  of  Window  is  fixed.  Computation  time  ratio  is  the  ratio  of  the  computation 
time  to  the  window  length.  The  mean  of  Interarrival  time  ranges  from  10%  to  100%  of  the 
mean  of  Window.  The  standard  deviation  of  these  three  random  variables  are  set  to  be 
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I 


parameters 

mean 

Window  length 

10.0 

Computation  time  ratio 

0.25  0.5  0.75 

Interarrival  time 

1.0,  2.0,  ...,  10.0 

Figure  11:  Parameters  of  the  experiments 


20%,  50%,  and  80%  of  their  means.  For  simplicity,  the  ratios  of  the  three  random  variables 
are  set  to  be  the  same  for  each  individual  experiment.  For  each  experiment  with  different 
parameters,  100  task  sets,  each  with  12  tasks,  are  generated  for  scheduling. 

We  compare  the  performance  of  these  algorithms  by  (1)  Percentage  of  accepted  tasks: 
the  number  of  accepted  tasks  by  the  algorithm  over  the  number  of  the  tasks  of  the  optimal 
schedule  by  exhaustive  search;  (2)  Success  ratio:  the  number  of  times  that  the  algorithm 
comes  up  with  an  optimal  schedule  in  the  100  task  sets;  and  (3)  Comparisons  per  task  set: 
the  number  of  comparisons  per  task  set  that  each  algorithm  takes.  When  interarrival  times 
are  small,  more  containing  relations  among  tasks  are  likely  to  happen.  Figure  12  shows  that 
the  heuristic  algorithms  perform  worse  under  this  condition  and  tend  to  reject  more  tasks, 
especially  when  the  computation  time  ratio  is  larger.  Set-Scheduler  always  reaches  100% 
acceptance  rate  since  it  is  an  optimal  scheduler.  In  the  figure,  because  the  characteristics 
of  the  data  with  different  standard  deviation  ratios  are  similar,  only  the  data  with  standard 
deviation  ratio  equal  to  0.8  are  depicted.  "When  success  ratio  is  concerned,  which  can  be  seen 
in  Figure  4,  the  heuristic  algorithms  performs  even  worse.  Generally  speaking,  the  heuristic 
algorithms  can  usually  produce  suboptimal  schedules,  but  fail  to  produce  the  optimal  ones 
most  of  the  time.  The  search  space  is  shown  in  Figure  "4.  Set-Scneduler  performs  well  at 
the  expense  of  the  complexity,  which  may  become  very  large  when  the  interarrival  times  are 
small.  The  cost  is  more  reasonable  while  the  interarrival  times  between  tasks  are  not  too 
small. 
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Complexity 


Figure  14:  Number  of  comparisons  per  task  set 

5  Conclusion  Remarks  and  Future  Work 

In  tins  paper,  we  discuss  the  optimization  techniques  in  real-time  scheduling  for  aperi¬ 
odic  tasks  in  a  uniprocessor  system  with  the  nori-preemptive  discipline.  We  first  propose 
a  Sequence- Scheduler  algorithm  to  compute  the  optimal  schedule  for  a  sequence  in  0(n2) 
time.  Then  a  Set-Scheduler  algorithm  is  proposed  based  on  the  super  sequence  and  Sequence- 
Scheduler  algorithm.  The  complexity  of  our  Set-Scheduler  algorithm  is  0(N  *n2),  compared 
to  0(N  *  n)  for  the  feasibility  test  by  Erschler  et  cl.,  where  N  might  be  as  large  as  expo¬ 
nential  in  the  worst  case.  However,  our  simulation  results  show  that  the  cost  is  reasonable 
for  the  average  case.  We  explore  the  temporal  properties  concerning  the  optimization  issues, 
and  present  several  theorems  to  formalize  the  results.  The  study  of  temporal  properties  on 
a  uniprocessor  may  serve  as  a  base  for  the  more  complex  cases  in  multiprocessor  systems. 

For  the  future  work,  we  propose  to  incorporate  the  decomposition  technique  [18]  into 
our  scheduling  algorithm.  Under  this  approach  a  task  set  can  be  decomposed  into  subsets, 
which  results  in  backtracking  points  to  reduce  the  search  space.  This  has  been  shown  to  be 


useful  in  reducing  the  search  space  substantially  when  the  task  set  is  well  decomposable. 
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ABSTRACT 

Let  us  consider  the  problem  of  scheduling  a  set  of  n  tasks  on  a  single  processor 
such  that  a  feasible  schedule  which  satisfies  the  time  constraints  of  each  task  is 
generated.  It  is  recognized  that  an  exhaustive  search  may  be  required  to  generate 
such  a  feasible  schedule  or  to  assure  that  there  does  not  exists  one.  In  that  case 
the  computational  complexity  of  the  search  is  of  the  order  n!. 

We  propose  to  generate  the  feasible  schedule  in  two  steps.  In  the  first  step 
we  decompose  the  set  of  tasks  into  m  subsets  by  analyzing  their  ready  times  and 
deadlines.  An  ordering  of  these  subsets  is  also  specified  such  that  in  a  feasible 
schedule  ail  tasks  in  an  earlier  subset  in  the  ordering  appears  before  tasks  in  a  later 
subset.  With  no  simplification  of  scheduling  of  tasks  in  a  subset,  the  scheduling 
complexity  is  0(E£Li  n>'-)i  where  n.  is  the  number  of  tasks  in  the  zth  subset. 

The  improvement  of  this  approach  towards  reducing  the  scheduling  complexity 
depends  on  the  the  number  and  the  size  of  subsets  generated.  Experimental  results 
indicates  that  significant  improvement  can  be  expected  in  most  situations. 
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I  Introduction 

Consider  the  problem  of  nonpreemptive  scheduling  of  n  tasks  on  a  single  CPU  of  a  hard 
real-time  system.  For  task  Ti,  identified  as  i,  the  scheduling  request  consists  of  a  triple  <  c,-, 
i"i,  d;  >  where  c;  is  the  computation  time,  r,-  the  ready  time  before  which  task  i  can  not 
start,  and  d,-  the  deadline  before  which  the  computation  must  be  completed.  Time  interval 
[r,-,  d;}  is  called  the  time  window  denoted  by  to,-.  The  window  length  |tu,j  is  d,-  —  r,-.  In  a 
hard  real-time  system,  a  schedule  is  called  feasible  if  all  tasks  are  processed  within  their 
individual  windows. 

The  result  of  the  scheduling  process  is  a  schedule  in  which  for  any  task  i,  a  start  time 
s,  and  a  finish  time  is  identified,  where  /,■  =  s;  -f  c,-.  Clearly,  a  schedule  is  feasible,  if  for 
every  task  i, 


Ti  <si<  di  -  a.  (1) 

The  scheduling  process  is  not  preemptive  only  if  for  any  two  tasks  i  and  j, 

si  ^  sj  ^  si  "h  C,  <  Sj.  (2) 

In  other  words,  when  task  i  is  scheduled,  a  span  of  nonpreemptable  processing  time, 
c,-,  is  allocated  for  it.  No  other  task  may  be  in  execution  during  that  time  span.  Thus  the 
scheduling  problem  is  to  find  a  mapping  from  a  task  set  {i}  to  a  start  time  set  {s,},  such 
that  constraints  in  (1)  and  (2)  are  met.  Note  that  for  a  given  set  of  tasks  {:},  there  may¬ 
be  none,  one  or  many  feasible  schedules. 

In  general  the  nonpreemptive  real-time  scheduling  problem  is  known  to  be  NP-complete 
[Gare79j.  To  find  a  feasible  schedule,  the  number  of  schedules  to  be  examined  is  0(n!), 
which  we  count  as  the  scheduling  complexity.  Heuristic  techniques  can  be  used  [Ma84, 
McMa75,  MokS3,  ZhaoST]  to  reduce  the  complexity.  This  reduction,  however,  is  achieved 
at  the  cost  of  obtaining  a  potentially  sub-optimal  solution.  That  is,  when  looking  for  feasible 
schedules,  heuristic  techniques  may  not  yield  a  feasible  schedule,  even  though  one  exists. 
Schedules  based  on  the  earliest-deadline- first,  or  minimum-laxity -first  rules  are  examples  of 
such  heuristics  used  in  scheduling. 

An  alternate  approach  is  to  develop  analytical  methods  for  scheduling  [ErscS3,  Liu73]. 
This  approach  analyzes  the  relationships  among  real-time  tasks  and  schedules.  The  purpose 
is  to  precisely  determine  optimal  task  schedules,  or  narrow  the  search  scope  from  the  original 
search  space. 

The  objective  of  this  research  is  to  develop  correct  and  efficient  algorithms  for  nonpre¬ 
emptive  real-time  scheduling.  We  call  a  scheduling  algorithm  correct ,  if  whenever  a  feasible 
schedule  exists,  the  algorithm  can  find  it. 
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In  this  paper,  we  present,  an  analytical  decomposition  approach  for  real-time  scheduling. 
The  strategy  is  to  divide  a  set  of  tasks  into  a  sequence  of  subsets,  such  that  the  search  for 
feasible  schedules  is  only  performed  within  each  subset.  The  decomposition  technique  used 
for  generating  the  sequence  of  subsets  assures  that  in  a  feasible  schedule  all  tasks  in  a  subset 
earlier  in  the  sequence  are  scheduled  before  any  task  in  a  later  subset.  Backtracking  in  the 
search  is  bounded  within  each  subset,  which  significantly  reduces  the  scheduling  complexity. 

There  are  several  different  strategies  which  can  be  used  to  subset  tasks.  The  decompo¬ 
sition  strategy  discussed  in  this  paper  is  to  use  a  relation  called  the  leading  relation  which 
depends  on  the  tasks’  relative  window  positions. 

We  performed  an  experiment  which  examined  the  number  and  size  of  subsets  with 
regard  to  the  number  of  tasks,  task  arrival  rate,  and  window  length.  We  found  that,  in 
general,  the  number  of  tasks  in  any  subset  is  independent  of  the  total  number  of  tasks  to 
be  scheduled,  if  the  task  window  lengths  are  bounded.  The  decomposition  scheduling  is  a 
polynomial  computation.  As  a  consequence,  the  decomposition  method  is  very  practical  for 
the  implementation. 

In  section  II  we  present  some  basic  notions  used  in  the  paper.  In  section  III  we  discuss 
a  case  where  all  the  tasks  have  the  leading  relation  with  each  other.  Our  approach  of 
decomposition  scheduling  is  introduced  in  section  IV  along  with  concepts  of  the  single 
schedule  subset  and  decomposed  leading  schedule  sequence.  We  present  our  experiment 
results  in  section  V.  Our  conclusion  and  future  research  in  section  VII. 

II  Background 

If  we  consider  any  two  tasks  i  and  j,  they  must  have  one  of  these  three  relations: 

1.  leading  -  i  •<  j  (or  j  -<  i),  where  if  r,  <  rJ5  dj  <  dj  and  tc,-  ^  Wj. 

2.  matching  -  i\\j,  if  r,-  =  Tj  and  dj  =  dj. 

3.  containing  -  i  LI  j  (or  j  L)  :),  .if  r,  <  Tj  and  dj  <  dj. 

These  three  relations  are  shown  in  Fig.  1.  It  is  easy  to  see  that  the  leading,  matching 
and  containing  relations  are  ail  transitive.  Additionally,  if  i|jj  or  i  U  j,  we  say  that  i  and  j 
are  concurrent. 

A  length  is  associated  with  a  schedule  which  is  the  finish  time  of  the  last  task  in  the 
schedule.  One  example  is  shown  in  Fig.  2. 

The  concept  of  dominance  was  introduced  in  [ErscS3],  and  we  will  use  it  later  in  the 
discussion. 

Definition  1  For  two  schedules  5j  and  S?,  Si  dominates  S 2  if  and  only  if: 

S2  feasible  =*•  Si  feasible. 
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1.  Leading 


2.  Matching 


3.  Containing 


time 


Figure  1:  Task  window  relations 

Definition  2  A  set  of  schedules.  5,  is  dominant  if  V52  £5,  35]  65  such  that  5]  dominates 

52. 

A  schedule  is  dominant  if  it  dominates  all  other  schedules. 

Ill  The  Leading  Schedule  Sequence 

Let  us  consider  the  case  where  for  a  set  of  task  {i},  every  pair  of  tasks  in  this  set  has  a 
leading  relation,  i.e.  i  •<  j  or  j  -<  for  every  j,  i  ^  j. 

Based  on  the  leading  relation  we  can  define  a  icrfcaJ  order  of  tasks  for  the  set.  We  define 
the  leading  schedule  sequence  ( LSS )  to  be  a  sequence  c£  -taste  in  which  tasks  are  in  order 
according  to  the  leading  relation,  that  is,  for  any  i  and  j,  i  A  j  <=>  i  •<  j,  where  i  -L  j 
means  that  i  is  scheduled  in  front  of  j. 

Theorem  1  For  a  set  of  tasks  all  of  which  have  a  pairwise  )e&ciir$  rda-fcton,  the  schedule 
where  tasks  are  sequenced  in  order  of  the  leading  schedule  sequence  is  a  dominant  one. 

Proof:  We  prove  this  theorem  by  construction. 


1 


2 


3 


time 


schedule  length 


■> 


Figure  2:  An  example  of  one  schedule 

Suppose  this  set  of  tasks  has  a  feasible  schedule  5  in  which  tasks  occur  in  a  different 
sequence  than  the  leading  schedule  sequence.  When  we  examine  this  schedule,  let  i  and  j 
be  the  first  pair  of  tasks  that  are  not  ordered  by  the  leading  relation,  i.e.  i  -<  j,  but  j  A  i. 
From  the  leading  relation  we  know  that  r,-  <  Tj  and  d,  <  dj. 

Since  j  and  i  are  the  first  such  pair,  deadlines  of  all  tasks  between  j  and  i  in  5  must 
be  greater  than  or  equal  to  d:  as  well  as  d,.  In  5,  let  us  construct  another  schedule  S'  by 
moving  i  from  the  current  position  in  5  to  the  position  just  in  front  of  j.  The  start  time 
and  finish  time  of  tasks  between  j  and  i  including  j  will  be  increased  by  no  more  than  c,\ 
And  so,  no  task  between  j  and  i  including  j  will  finish  later  than  d,-.  Meanwhile,  the  rest  of 
this  schedule  is  unchanged.  Thus  if  5  is  feasible,  the  new  schedule  S'  will  be  feasible  too. 

By  repeating  the  process  of  constructing  S'  from  5,  we  obtain  a  schedule  which  has  all 
tasks  ordered  according  to  the  leading  relation,  such  that  if  the  original  schedule  is  feasible, 
so  is  the  constructed  one.  □ 

Thus  if  there  exists  a  set  of  feasible  schedules,  the  set  must  contain  schedules  that  are 
conforming  to  the  order  of  the  leading  schedule  sequence.  The  result  can  be  generalized 
to  the  situation  where  there  exist  matching  windows.  The  generalization  is  to  combine 
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the  tasks  with  the  same  window  into  one  task  whose  computation  time  is  the  sum  of  the 
computation  times  of  all  these  tasks. 

We  will  see  that  the  above  leading  schedule  sequence  is  a  special  case  of  the  decomposed 
leading  schedule  sequence  introduced  in  the  next  section. 

IV  Task  Decomposition 

A.  Philosophy 

To  solve  the  general  real-time  scheduling  problem  with  n  tasks,  the  number  of  schedules  to 
be  examined  can  be  as  much  as  0(n!).  However,  taking  a  closer  look,  we  find  that  every 
task  has  an  important  property  called  the  locality  of  a  task,  that  is,  a  task  is  time-bounded 
by  its  time  window.  Furthermore,  if  any  two  task  windows  are  not  overlapping,  there  is 
only  one  possible  order  for  them.  The  above  facts  motivate  us  to  separate  the  tasks  into 
subsets  according  to  their  different  time  localities. 

The  decomposition  scheduling  can  be  divided  into  two  steps:  decomposition  and  schedul¬ 
ing. 

First,  a  set  of  n  tasks  is  decomposed  into  a  sequence  of  m  subsets  such  that  the  orders  cf 
subsets  are  fixed.  The  order  of  a  task  is  determined  only  relative  to  the  other  tasks  within 
its  own  subset.  The  sequence  of  the  subsets  is  called  the  decomposed  schedule  sequence. 
The  decomposition  should  be  so  developed  that  the  schedulabilitv  of  tasks  is  not  damaged 
at  all.  The  decomposition  by  using  the  leading  relation  introduced  in  this  paper  shows  this 
property. 

The  second  step  is  to  schedule  the  subsets  in  the  sequence  order.  It  always  selects 
a  schedule  for  each  subset  with  the  shortest  length,  so  that  when  a  subset  is  scheduled, 
the  time  span  available  for  it  is  maximized.  In  this  way,  the  total  number  of  schedules 
to  be  examined  is  only  0(D£Li  n»!)i  where  n;  is  ibe  number  of  tasks  in  the  ith  subset 
(E£i  rii  =  n). 

The  only  remaining  problem  is  how  to  decompose  a  set  of  tasks  into  a  sequence  of 
subsets  of  tasks  such  that  a  feasible  schedule  is  guaranteed  to  be  found  if  one  exists.  In  the 
rest  of  this  paper,  we  outline  how  to  use  the  leading  relation  as  a  means  to  divide  the  task 
set. 

B.  Decomposition  Scheme 

A  set  of  tasks  is  called  the  single  schedule,  subset  (sss),  represented  as  r,  if 

V:  €  r  3j  6  r  (i  D  j )  V  (j  u  i)  v  (i\\j). 
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In  other  words,  each  task  window  is  contained  in  the  window  of  another  task,  contains 
the  window  of  another  task,  or  matches  the  window  of  another  task  in  the  subset. 

Given  a  set  of  tasks  {:},  we  can  decompose  it  into  a  sequence  of  single  schedule  subsets 
r1,  r2,  ■  -  •,  rk  such  that  all  the  tasks  in  r!  are  leading  to  all  the  tasks  in  r,+1. 

The  decomposed  leading  schedule  sequence  ( DLSS )  is  defined  to  be  a  sequence  of  single 
schedule  subsets,  denoted  as: 

DLSS  =  r1  o  t2  o  •  •  •  o  rm, 

such  that  V/;'  £  r*  V/;->  £  k'  -<  V ,  for  1  <  t  <  j  <  m,  (denoted  as  r’  -<  rJ),  and  r1  can  not 
be  further  decomposed,  for  i  =  Symbol  o  represents  a  concatenating  operation. 

Note  that  if  a  task  in  r’  does  not  lead  another  task  in  tj  for  i  <  j ,  they  must  have 
a  matching  or  containing  relation.  If  this  happens,  r’  and  tj  can  not  be  different  single 
schedule  subsets.  Clearly,  all  n  tasks  may  belong  to  a  single  schedule  subset. 

Theorem  2  The  set  of  schedules  conforming  to  the  decomposed  leading  schedule  sequence 
is  dominant. 

Proof:  Assume  that  if  there  are  two  tasks  k'  £  r’  and  £  rJ,  where  r*  -<  rJ.  There  is  no 
common  concurrent  task  with  both  k 1  and  k’  is  positioned  in  front  of  k'  in  a  feasible 
schedule  (5).  Specifically,  5  =  (•••)  o  (k>  o  •  •  • k *)  o  (•  •  •).  Let  us  define  S’  =  o  •  •  •&')  for 
abbreviation  (5  =  (•••)o5'o(-  -  •)).  The  new  schedule  created  by  exchanging  k’-'s  position 
with  kJ  :s  is  still  feasible. 

Without  loss  of  generality,  suppose  that  k ’  and  are  the  first  such  pair  in  5.  Tasks 
between  kJ  and  k'  are  led  by  k: ,  or  concurrent  with  kJ,  but  not  leading  to  and  not  concurrent 
with  k\  Since  k'  -<  &  (i.e.  r,-  <  rj),  switching  k': s  and  k:'s  positions  wiD  not  increase  the 
finish  time  of  S',  which  is  defined  as  the  finish  time  of  the  last  task  in  S'.  All  the  tasks 
between  k ’  and  k>,  including  k},  are  led  by  k\  i.e.  having  deadlines  greater  than  or  equal 
to  dki.  If  5  is  feasible  with  kl  as  the  last  task  in  S',  it  will  be  still  feasible  after  the 
switching.  D 

Note  that  if  the  set  of  schedules  that  are  following  the  decomposed  leading  schedule 
sequence  is  empty,  there  is  no  feasible  schedule  available  for  the  tasks  to  be  scheduled. 

C.  Decomposition  Algorithm 

Decomposing  a  set  of  tasks  into  single  schedule  subsets,  the  algorithm  starts  with  the  tasks 
having  been  sorted  by  their  ready  times  (using  their  deadlines  if  their  ready  times  are  the 
same). 
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The  algorithm  uses  one  single  loop  to  determine  which  single  schedule  subset  the  cur¬ 
rent  task  should  belong  to.  The  loop  consists  of  two  parts.  The  first  part  is  a  while  loop 
which  merges  single  schedule  subsets  into  one,  if  the  current  task  is  contained  by  them.  The 
second  part  decides  whether  the  current  task  can  form  a  new  single  schedule  subset,  or  join 
with  another  single  schedule  subset. 


The  Leading-relation  Decomposition  Algorithm 
begin 

/*  Initialization.  */ 

k  =  l;  t1  =  {i}; 

rr>  =  ri ;  dr,  =  d  1 ; 

for  ss  2  to  n  do  /*  Go  over  the  task  list.  */ 

l  =  k  —  1;  /*  1  is  the  index  of  single  schedule  subsets.  */ 

continue  =  TRUE; 

while  (/  >  0)  A  ( continue )  do 

/*  Merge  single  schedule  subsets  if  the  current  task  is  concurrent 
with  tasks  in  different  subsets.  */ 
if  (dr,  >  di) 

T1  =  Tk  U  Tl] 

dr,  -  dTk\ 

*  =  /; 
else 

continue  =  FALSE; 

i  =  i-  i; 

od 

if  (r.k  <  T{)  A  (d,  <  dTk) 

/*  The  current  task  is  concurrent  with  tasks  in  the  current  subset.*/ 

r*  =  r*  U  {:}; 

else  if  (rTk  <  r,)  A  (dT*  <  d,-) 

/*  The  current  task  is  led  by  all  the  tasks  in  the  current  subset. 

A  new  single  schedule  subset  is  created  only  containing  the  current  task.*/ 
k  =  k  +  1; 
rk  = 

Trk  -  r,-; 
dTk  =  di. 
od 

end 
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In  this  algorithm.  the  outer  loop  is  executed  n  times.  The  while  loop  is  executed  no 
more  than  the  number  of  time  proportional  to  n  in  total,  since  no  more  than  n  subsets  can 
be  merged  during  the  whole  execution  of  the  algorithm  with  n  tasks.  Thus  the  complexity 
of  this  algorithm  is  only  O(n).  If  we  count  in  the  sorting  complexity,  the  decomposition 
will  cost  no  more  than  O(niogn). 


D.  Scheduling  Scheme 


After  tasks  has  been  decomposed  into  a  sequence  of  subsets,  scheduling  should  be  performed 
on  each  subset  in  the  sequence  order,  such  that  the  schedule  on  each  subset  is  of  the 
shortest  length.  A  brute  force  method  is  to  give  an  exhaustive  search  whose  computational 
complexity  amounts  to  0(n,!),  where  nt-  is  the  number  of  tasks  in  the  ith  subset. 

In  [Yuan89b],  other  scheme  is  explored  for  scheduling  a  subset.  The  method  is  to  first 
build  a  super-sequence  where  tasks  may  have  several  occurrences.  The  occurrence  of  a  task 
is  decided  by  its  relative  window  position  in  the  subset.  Selecting  one  occurrence  for  every 


task  in  the  super- sequence  forms  a  schedule.  A  complete  search  costs  0{nj  n'  '  )  in  the 

j  ■  ^  1 . 

worst  case.  When  we  made  a  few  calculation  samples  of  n,-  ’  ’  with  n;  less  than  100, 

-(n  — ) 

nf  '  is  a  much  smaller  number  than  n,-!,  as  shown  in  the  cited  paper. 

Since  the  set  of  schedules  following  the  decomposed  leading  schedule  sequence  is  domi¬ 
nant,  and  since  the  subsets  are  scheduled  in  the  sequence  order  with  their  shortest  length,  it 
is  proved  that  the  decomposition  scheduling  with  the  leading  relation  is  correct  [YuanS9b]. 


V  Empirical  Study 

A.  Experiment 

In  order  to  observe  the  behavior  of  the  number  of  tasks  in  a  single  schedule  subset  and 
number  of  the  subsets  to  be  created  with  regard  to  the  number  of  tasks  to  be  scheduled, 
task  arrival  rate,  and  task  window  length,  we  conduct  an  experiment  as  an  example  to  see 
the  feasibility  of  our  approach  for  practical  implementation. 

The  outputs  we  are  interested  in  are: 

1.  the  number  of  single  schedule  subsets  (s<ss), 

2.  the  number  of  window  concurrences, 

3.  the  maximum  number  of  tasks  in  single  schedule  subsets, 

4.  the  minimum  number  of  tasks  in  single  schedule  subsets,  and 
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5.  the  average  number  of  tasks  in  single  schedule  subsets. 

One  window  concurrence  is  counted  for  any  two  tasks  i  and  j  if  i  and  j  have  a  concurrent 
relation.  We  call  the  number  of  tasks  in  a  single  schedule  subset  as  the  size  of  u  subset. 

Meanwhile,  we  change  the  following  parameters  independently  to  watch  the  changes  in 
the  outputs, 

1.  the  number  of  total  tasks, 

2.  task  arrival  rate,  and 

3.  window  length. 

The  data  is  shown  in  Table  1-41  in  the  end  of  this  paper.  Following  axe  basic  rules  in 
the  experiment. 

1.  The  computation  time  is  uniformly  distributed  over  (0,aj. 

2.  The  task  interarrival  is  uniformly  distributed  over  (0,/3).  The  arrival  rate  is  2//?. 

3.  The  window  length  is  also  randomly  created  by  controlling  the  laxity  for  each  task. 
The  laxity  of  a  task  is  the  difference  between  its  window  length  and  its  computation 
time.  The  laxity  is  uniformly  distributed  [0,  7).  The  distribution  guarantees  the 
window  length  greater  than  the  computation  time  for  the  task. 

We  notice  that  the  arrival  rate  should  be  less  than  or  equal  to  the  service  rate,  otherwise, 
there  are  congestions  in  the  system,  which  will  result  in  deadline-missing.  In  other  words, 
213  <  2/a.  That  is, 

0  <  3. 

The  random  numbers  are  provided  by  function  drandQ  in  the  UNIX  operating  system. 
The  numbers  are  uniformly  distributed  over  [0,  1)  (StevS6]. 

In  the  experiment,  we  found  that  the  minimum  size  of  single  schedule  subsets  is  always 
one. 

B.  The  Result  Explanation  and  Observation 

From  the  experiment  results  ,  we  found  that  when  the  average  window  length  increases 
(7  increases),  the  number  of  single  schedule  subset's  reduces  and  the  maximum  size  of 
single  schedule  subsets  slightly  increases.  The  result  is  expected,  since  the  larger  some  task 

‘In  the  tables,  number  is  represented  by  num.  Window  by  W.  Concurrences  by  concurr.  Average  by 
avg.  The  Single  schedule  subset  by  sss. 


windows  s.r e.  the  more  tasks  may  be  concurrent  with  tiiem.  These  t.asks  may  be  in  the  same 
single  schedule  subset. 

When  P  increases,  that  is,  the  arrival  rate  decreases,  the  number  of  single  schedule 
subsets  increases,  and  maximum  size  of  single  schedule  subsets  decreases.  The  result  is  also 
expected,  since  when  the  arrival  rate  decreases,  the  opportunity  of  tasks  concurrent  with 
each  other  decreases  too.  Most  tasks  have  the  leading  relation  with  each  other. 


Figure  3:  The  relationship  between  the  size  of  single  schedule  subsets  and  the  number  of 
tasks  with  regard  to  the  laxity  parameter  7,  where  0  =  4,  P  =  4. 

Fig.  3  shows  the  relationship  between  the  maximum  size  of  single  schedule  subsets  and 
the  number  of  tasks  to  be  scheduled.  From  the  experiment,  we  found  that  the  size  of  a 
single  schedule  set  never  exceeds  14  even  when  there  are  300  tasks  being  scheduled.  The 
observation  indicates  that  for  most  cases  mai^L^n;)  is  a  constant. 

We  show  the  relationship  between  the  number  of  subsets  (m)  and  number  of  tasks  to 
be  scheduled  in  Fig.  4  and  Fig.  5  with  regard  to  different  window  length  and  arrival  rate 
distributions. 
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m 


Figure  4:  The  relationship  between  the  number  of  single  schedule  subsets  and  the  number 
of  tasks  with  regard  to  the  laxity  parameter  7,  where  q  =  4,  /?  =  4. 

VI  Final  Remarks 

In  this  paper,  we  examine  the  problem  of  nonpreemptive  scheduling  of  n  tasks  on  a  single 
CPU  in  hard  real-time  systems.  We  propose  a  correct  decomposition  strategy  for  the 
scheduling.  The  strategy  significantly  reduces  the  scheduling  complexity  for  most  cases. 

In  this  paper  we  have  examined  a  decomposition  technique  based  only  on  the  windows 
of  tasks.  By  taking  into  account  the  computation  time  requirements,  the  decomposition 
can  be  made  stronger  [Yuan89a].  The  decomposition  approach  may  also  be  extended  to 
consider  precedence  and  other  dependences  among  tasks.  This  aspect  of  decomposition 
technique  needs  further  study. 
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Figure  5:  The  relationship  between  the  number  of  single  schedule  subsets  and  the  number 
of  tasks  with  regard  to  the  arrival  rate  parameter  /3,  where  a  =  4,  7  =  2. 

experiment. 
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Table  1:  a  —  4,  0  =  4 


j  num  cf 
.  tasks 


avg  W.  num  of 


t  s.^s  | 


45 


35 


25 


36 


30 


82 


74 


68 


55 


41 


IIS 


114 


92 


73 


6S 


159 


144 


116 


113  | 


81 


sss  size 
max  j  avg  i 


2  1.11 
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Table  3:  q  =  4,  0  =  8 


num  of 

tasks 

■ 

avg  W. 
length 

H 

sss  size  | 

max 

avg 

2 

3.24 

9 

42 

3 

1.19 

4 

3.56 

5 

45 

3 

1.11 

6 

4.67 

40 

3 

1.25 

8 

5.95 

3 

47 

2 

1.06 

10 

6.51 

20 

32 

5 

1.56 

100 

2 

2.99 

6 

94 

2 

1.06 

D 

4.20 

13 

87 

3 

1.15 

6 

5.04 

26 

75 

3 

1.33 

8 

5.77 

29 

73 

5 

1.37 

10 

6.91 

33 

70 

5 

1.43 

150 

2 

2.99 

23 

127 

mm 

1.18 

4 

4.01 

2S 

126 

4 

1.19 

6 

5.27 

28 

122 

3 

1.23 

8 

6.00 

32 

120 

5 

1.25 

10 

7.22 

39 

112 

6 

1.34 

200 

2 

3.09 

19 

IS  2 

3 

1.10 

4 

3.94 

16 

182 

3 

1.10 

6 

4.87 

31 

171 

4 

1.17 

8 

5.91 

51 

154 

5 

1.30 

10 

6.96 

78 

130 

6 

1.54 

250 

2 

2.91 

23 

227 

3 

1.10 

4 

4.01 

48 

206 

4 

1.21 

6 

5.01 

52 

204 

5 

1.23 

8 

5.84  |  56 

195 

5 

1.28 

10 

7.48  |  60 

191 

4 

1.31 

300 

2 

2.99 

35 

267 

4  j  1.12 

4 

4.05  |  30 

270 

3 

1.11 

6 

5.07 

60 

241 

4 

1.24 

6 

5.99 

63 

240 

4 

1.25 

10 

7.13 

no 

199 

6 

1.51 
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Table  4:  q  =  4,  /?  =  10 

num  of 

tasks 

7 

avg  W. 
length 

num  of 

concurr. 

num  of 

sss  size  j 

sss 

max 

avg 

50 

2 

3.44 

2 

48 

2 

2.04 

4.11 

5 

45 

3 

1.11 

6 

5.50 

4 

46 

3 

1.09 

8 

6.69 

11 

41 

4 

1.22 

10 

6.61 

15 

38 

1.32 

100 

2 

3.02 

89 

1.12 

4 

4.01 

14 

87 

4 

1.15 

6 

4.95 

21 

80 

4 

1.25 

8 

6.59 

IS 

82 

3 

1.22 

10 

6.66 

25 

76 

5 

1.32 

2 

3.00 

6 

144 

2 

1.04 

4 

3.98 

11 

139 

3 

1.08  ! 

6 

4.72 

11 

139 

2 

1.06  | 

8  |  5.91 

37 

117 

5 

1.28 

43 

109 

6 

1.36 

200 

2  |  3.00  |  18  |  182 

3 

1.10  j 

4 

3.97  |  27  |  174 

5 

1.15  | 

6 

5.19 

23  |  177 

3 

1.13 

8  |  6.16  |  34  |  167 

< 

1.20  | 

10  |  6.90  j  52  |  153 

4 

1.31  I 

250 

j 

L2 

2.94  |  24 

226 

3 

1.10  | 

4  |  4.12  |  32  j  218  |  3  |  1.15  | 

6  ;  5.10 

43  |  2i3  |  4  |  1.17  | 

8  |  6.07 

45 

206  |  3  |  1.21  | 

10  |  6.96 

51 

i  200  |  4  |  1.25  j 

300 

2  |  3.05 

27  j  274  1  3  1  1.09  i 

4  |  4.07  |  21  j  280 

3  ,  1.07  | 

6  |  5.04  |  54  j  251 

6 

1.20  ! 

8 

6.16  |  46 

254 

3  |  1.18 

10 

6.61  j  61 

243 

8 

1.23 
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Abstract 

A  simple  approach  to  inter-domain  routing  is  domain-level  source  routing  with  link-state 
approach  where  each  node  maintains  a  domain-level  view  of  the  internetwork.  This  does  not  scale 
up  to  large  internetworks.  The  usual  scaling  technique  of  aggregating  domains  into  superdomains 
loses  ToS  and  policy  detail. 

We  present  a  new  viewserver  hierarchy  and  associated  protocols  that  (1)  satisfies  policy 
and  ToS  constraints,  (2)  adapts  to  dynamic  topology  changes  including  failures  that  partition 
domains,  and  (3)  scales  well  to  large  number  of  domains  without  losing  detail.  Domain-level 
views  are  maintained  by  special  nodes  called  viewservers.  Each  viewserver  maintains  a  domain- 
level  view  of  a  surrounding  precinct.  Viewservers  are  organized  hierarchically.  To  obtain  domain- 
level  source  routes,  the  views  of  one  or  more  viewservers  are  merged  (upto  a  maximum  of  twice 
the  levels  in  the  hierarchy). 

We  also  present  a  model  for  evaluating  inter-domain  routing  protocols,  and  apply  this  model 
to  compare  our  viewserver  hierarchy  against  the  simple  approach.  Our  results  indicate  that  the 
viewserver  hierarchy  finds  many  short  valid  paths  and  reduces  the  amount  of  memory  require¬ 
ment  by  two  orders  of  magnitude. 


Categories  and  Subject  Descriptors:  C.2.1  [Computer-Communication  Networks]:  Network  Archi¬ 
tecture  and  Design — packet  networks;  store  and  forward  networks;  C.2.2  [Computer-Communication  Net¬ 
works]:  Network  Protocols— protocol  architecture ;  C.2.m  [Routing  Protocols];  F.2.m  [Computer  Network 
Routing  Protocols]. 


*  This  work  is  supported  in  part  by  ARPA  and  Philips  Labs  under  contract  DASG60-92-0055  to  Department  of 
Computer  Science,  University  of  Maryland,  and  by  National  Science  Foundation  Grant  No.  NCR  89-04590.  The 
views,  opinions,  and/or  findings  contained  in  this  report  are  those  of  the  anthor(s)  and  should  not  be  interpreted  as 
representing  the  official  policies,  either  expressed  or  implied,  of  the  Advanced  Research  Projects  Agency,  PL,  or  the 
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1  Introduction 


A  computer  internetwork,  such  as  the  Internet,  is  an  interconnection  of  backbone  networks,  regional 
networks,  metropolitan  area  networks,  and  stub  networks  (campus  networks,  office  networks  and 
other  small  networks)1.  Stub  networks  are  the  producers  and  consumers  of  the  internetwork  traffic, 
while  backbones,  regionals,  and  MANs  are  transit  networks.  (Most  of  the  networks  in  an  internet¬ 
work  are  stub  networks.)  Each  network  consists  of  nodes  (hosts,  routers)  and  links.  Two  networks 
axe  neighbors  when  there  is  one  or  more  links  between  nodes  in  the  two  networks  (see  Figure  1). 


Figure  1:  A  portion  of  an  internetwork.  (Circles  represent  stub  networks.) 

An  internetwork  is  organized  into  domains 2 .  A  domain  is  a  set  of  networks  (possibly  consisting  of 
only  one  network)  administered  by  the  same  agency.  Within  each  domain,  an  intra-domain  routing 
protocol  is  executed  that  provides  routes  between  source  and  destination  nodes  in  the  domain.  This 
protocol  can  be  any  of  the  typical  ones,  i.e.,  next-hop  or  source  routes  computed  using  distance- 
vector  or  link-state  algorithms. 

Across  all  domains,  an  inter-domain  routing  protocol  is  executed  that  provides  routes  be¬ 
tween  source  and  destination  nodes  in  different  domains.  This  protocol  must  satisfy  various  con¬ 
straints: 

(1)  It  must  satisfy  policy  constraints,  which  are  administrative  restrictions  on  the  inter-domain 
traffic  [8,  12,  9,  5].  Policy  constraints  are  of  two  types:  transit  policies  and  source  policies. 
The  transit  policies  of  a  domain  A  specify  how  other  domains  can  use  the  resources  of  A 
(e.g.  SO. 01  per  packet,  no  traffic  from  domain  £).  The  source  policies  of  a  domain  A  specify 

1  Fox  example,  NSFNET,  MILNET  are  backbones  and  Smanet,  CerfNet  are  regionals. 

7  also  referred  to  as  routing  domains 
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constraints  on  traffic  originating  from  A  (e.g.  domains  to  avoid/prefer,  acceptable  connection 
cost).  Transit  policies  of  a  domain  are  public  (i.e.  available  to  other  domains),  whereas  source 
policies  are  usually  private. 

(2)  An  inter-domain  routing  protocol  must  also  satisfy  type-of-service  (ToS)  constraints  of  ap¬ 
plications  (e.g.  low  delajT,  high  throughput,  high  reliability,  Triiriirrmm  monetary  cost).  To  do 
this,  it  must  keep  track  of  the  types  of  services  offered  by  each  domain  [5]. 

(3)  Inter-domain  routing  protocols  must  scale  up  to  very  large  internetworks,  i.e.  with  a  very  large 
number  of  domains.  Practically  this  means  that  processing,  memory  and  communication 
requirements  should  be  much  less  than  linear  in  the  number  of  domains. 

(4)  Inter-domain  routing  protocols  must  automatically  adapt  to  link  cost  changes,  node/link 
failures  and  repairs  including  failures  that  partition  domains  [15].  It  must  also  handle  non- 
hieraxchical  domain  interconnections  at  any  level  [9]  (e.g.  we  do  not  want  to  hand-conffgure 
special  routes  as  “back-doors”). 

A  simple  (or  straightforward)  approach  to  inter-domain  routing  is  domain-level  source  routing 
with  link-stale  approach  [8,  5].  In  this  approach,  each  router3  maintains  a  domain-level  view  of  the 
internetwork,  i.e.,  a  graph  with  a  vertex  for  every  domain  and  an  edge  between  every  two  neighbor 
domains.  Policy  and  ToS  information  is  attached  to  the  vertices  and  the  edges  of  the  view. 

When  a  source  node  needs  to  reach  a  destination  node,  it  (or  a  router'4  in  the  source’s  domain) 
first  examines  this  view  and  determines  a  domain-level  source  route  satisfying  ToS  and  policy 
constraints,  i.e.,  a  sequence  of  domain  ids  starting  from  the  source’s  domain  and  ending  with  the 
destination’s  domain.  Then,  the  packets  are  routed  to  the  destination  using  this  domain-level 
source  route  and  the  intra-domain  routing  protocols  of  the  domains  crossed. 

The  disadvantage  of  this  simple  scheme  is  that  it  does  not  scale  up  for  large  internetworks.  The 
storage  at  each  router  is  proportional  to  No  x  Ed,  where  No  is  the  number  of  domains  and  Eo 
is  the  average  number  of  neighbor  domains  to  a  domain.  The  communication  cost  is  proportional 
to  Nr  x  Er ,  where  Nr  is  the  number  of  routers  in  the  internetwork  and  Er  is  the  average  router 
neighbors  of  a  router  (topology  changes  axe  flooded  to  all  routers  in  the  internetwork). 

To  achieve  scaling,  several  approaches  based  on  aggregating  domains  into  superdomains  have 

3  Not  all  nodes  maintain  routing  tables.  A  router  is  a  node  that  maintains  a  routing  table. 

*  referred  to  as  the  policy  server  in  [8] 
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been  proposed  [13,  16,  6].  This  approaches  have  drawbacks  because  the  aggregation  results  in  loss 
of  detail  (discussed  in  Section  2). 

Our  protocol 

In  this  paper,  we  present  an  inter-domain  routing  protocol  that  we  have  proposed  recently[3].  It 
combines  domain-level  views  with  a  novel  hierarchical  scheme.  It  scales  well  to  large  internetworks, 
and  does  not  suffer  from  the  problems  of  superdomains. 

In  our  scheme,  domain-level  views  are  not  maintained  by  every  router  but  by  special  nodes 
called  viewservers.  For  each  viewservei,  there  is  a  subset  of  domains  around  it,  referred  to  as  the 
viewserver’s  precinct.  The  viewserver  maintains  the  domain-level  view  of  its  precinct.  This  solves 
the  scaling  problem  for  storage  requirement. 

A  viewserver  can  provide  domain-level  source  routes  between  source  and  destination  nodes  in 
its  precinct.  Obtaining  a  domain-level  source  route  between  a  source  and  a  destination  that  are 
not  in  any  single  view,  involves  accumulating  the  views  of  a  sequence  of  viewservers.  To  make  this 
process  efficient,  viewservers  are  organized  hierarchically  in  levels,  and  an  associated  addressing 
structure  is  used.  Each  node  has  a  set  of  addresses.  Each  address  is  a  sequence  of  viewserver  ids  of 
decreasing  levels,  starting  at  the  top  level  and  going  towards  the  node.  The  idea  is  that  when  the 
views  of  the  viewservers  in  an  address  are  merged,  the  merged  view  contains  domain-level  routes 
to  the  node  from  the  top  level  viewservers.  (Addresses  are  obtained  from  name  servers  in  the  same 
wav  as  is  currently  done  in  the  Internet.) 

We  handle  dynamic  topology  changes  such  as  node/link  failures  and  repairs,  link  cost  changes, 
and  domain  partitions.  Gateways5  detect  domain-level  topology  changes  affecting  its  domain  and 
neighbor  domains.  For  each  domain,  there  is  a  reporting  gateway  that  communicates  these  changes 
by  flooding  to  the  viewservers  in  a  specified  subset  of  domains;  this  subset  is  referred  to  as  its  flood 
area.  Hence,  the  number  of  packets  used  during  flooding  is  proportional  to  the  size  of  the  flood 
area.  This  solves  the  scaling  problem  for  the  communication  requirement. 

Thus  our  inter-domain  routing  protocol  consists  of  two  subprotocols:  a  view-query  proto¬ 
col  between  routers  and  viewservers  for  obtaining  merged  views;  and  a  view-update  protocol 
between  gateways  and  viewservers  for  updating  domain-level  views. 

*  A  node  is  called  a.  gateway  if  it  has  a  link  to  another  domain. 
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Evaluation 

Many  inter-domain  routing  protocols  have  been  proposed,  based  on  various  kinds  of  hierarchies. 
How  do  these  protocols  compare  against  each  other  and  against  the  simple  approach?  To  answer  this 
question,  we  need  a  model  in  ■which  we  can  define  internetwork  topologies.  policy/ToS  constraints, 
inter-domain  routing  hierarchies,  and  evaluation  measures  (e.g.  memory  and  time  requirements) 
for  inter-domain  routing  protocols.  None  of  these  protocols  have  been  evaluated  in  a  way  that  they 
can  be  compared  against  each  other  or  the  simple  approach. 

In  this  paper,  we  present  such  a  model,  and  use  it  to  compare  our  viewserver  hierarchy  to  the 
simple  approach.  Our  evaluation  measures  are  the  amount  of  memory  required  at  the  source  and 
at  the  routers,  the  amount  of  time  needed  to  construct  a  path,  and  the  number  of  valid  paths 
found  (and  their  lengths)  in  comparison  to  the  number  of  available  valid  paths  (and  their  lengths) 
in  the  internetwork.  We  use  three  internetwork  topologies  each  of  size  11,110  domains  (roughly  the 
current  size  of  the  Internet).  Our  results  indicate  that  the  viewserver  hierarchy  finds  many  short 
valid  paths  and  reduces  the  amount  of  memory  requirement  by  two  orders  of  magnitude. 

Organization  of  the  paper 

In  Section  2,  we  survey  recent  approaches  to  inter-domain  routing.  In  Section  3,  we  present  the 
view-query  protocol  for  static  network  conditions,  that  is,  assuming  all  links  and  nodes  of  the 
network  remain  operational.  In  Section  4,  we  present  the  view-update  protocol  to  handle  topology 
changes  (this  section  is  not  needed  for  the  evaluation  part).  In  Section  5,  we  present  our  evaluation 
model  and  results  from  its  application  to  the  viewserver  hierarchy.  In  Section  6,  we  conclude  and 
describe  how  to  add  fault-tolerance  and  cacheing  schemes  to  improve  performance. 

2  Related  Work 

In  this  section,  we  survey  recently  proposed  inter-domain  routing  protocols  that  support  ToS  and 
Policy  routing  for  large  internetworks  [14,  16,  13,  10,  6,  20,  2, 19, 18,  7j. 

Several  inter-domain  routing  protocols  (e.g.  BGP  [14],  ID  HP  [16],  NIL  [10])  are  based  on  path- 
vector  approach  [17],  Here,  for  each  destination  domain  a  router  maintains  a  set  of  paths,  one 
through  each  of  its  neighbor  routers.  ToS  and  policy  information  is  attached  to  these  paths.  Each 
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router  requires  0(N d  x  Np  x  £r)  space.  For  each  destination,  a  router  exchanges  its  best  valid 
path6  with  its  neighbor  routers.  However,  a  path- vector  algorithm  may  not  find  a  valid  path 
from  a  source  to  the  destination  even  if  such  a  route  exists  [13]".  By  exchanging  k  paths  to  each 
destination,  the  probability  of  detecting  a  valid  path  for  each  source  can  be  increased. 

The  most  common  approach  to  solve  the  scaling  problem  is  to  use  superdomain s8  (e.g.  HDPR  [13], 
ID  IIP  [16],  Nimrod  [6]).  Superdomains  extend  the  idea  of  area  hierarchy  [11].  Here,  domains  axe 
grouped  hierarchically  into  superdomains:  “close”  domains  axe  grouped  into  level  1  superdomains, 
“dose”  level  1  superdomains  are  grouped  into  level  2  superdomains,  and  so  on.  Each  domain 
A  is  addressed  by  concatenating  the  superdomain  ids  starting  from  a  top  level  superdomain  and 
going  down  towards  A.  A  router  maintains  a  view  that  contains  the  domains  in  the  same  level  1 
superdomain,  the  level  1  superdomains  in  the  same  level  2  superdomain,  and  so  on.  Thus  a  router 
maintains  a  smaller  view  than  it  would  in  the  absence  of  hierarchy.  Each  superdomain  has  its  own 
ToS  and  policy  constraints  derived  from  that  of  the  subdomains. 

There  axe  several  major  problems  with  using  superdomains.  One  problem  is  that  if  there  are 
domains  with  different  (possibly  contradictory)  constraints  in  a  superdomain,  then  there  is  no  good 
way  of  deriving  the  ToS  and  policy  constraints  of  the  superdomain.  The  usual  techniques  are  to 
take  either  the  union  or  the  intersection  of  the  constraints  of  the  subdomains  [13].  Both  techniques 
have  problems9.  Other  problems  are  described  in  [6,  2].  Some  of  the  problems  can  be  relaxed  by 
having  overlapping  superdomains,  but  this  increases  the  storage  requirements  drastically. 

Nimrod  [6]  and  IDPR  [13]  use  the  link-state  approach,  domain-level  source  routing,  and  super¬ 
domains  (non-overlapping  superdomains  for  Nimrod).  IDItP  [16]  uses  path-vector  approach  and 
superdomains. 

Reference  [10]  combines  the  benefits  of  path-vector  approach  and  link-state  approach  by  having 
two  modes:  An  NR  mode,  which  is  an  extension  of  IDRP  and  is  used  for  the  most  common  ToS 
and  policy  constraints;  and  a  SDR  mode,  which  is  like  IDPR  and  is  used  for  less  frequent  ToS  and 

6  A  valid  path  is  a  path  that  satisfies  the  ToS  and  policy  constraints  of  the  domains  in  the  path. 

7  Fox  example,  suppose  a  Touter  u  has  two  paths  Pi  and  P2  to  the  destination.  Let  u  have  a  router  neighbor  t>, 
which  is  in  another  domain,  ti  chooses  and  informs  v  of  one  of  the  paths,  say  Pi.  Bnt  Pi  may  violate  source  policies 
of  ti’s  domain,  and  P2  may  be  a  valid  path  fox  v. 

*  also  reierred  to  as  ronting  domain  confederations 

*  For  example,  if  the  union  is  taken,  then  a  subdomain  A  can  be  forced  to  obey  constraints  of  other  subdomains; 
this  may  eliminate  a  path  through  A  which  is  otherwise  valid.  If  the  intersection  is  taken,  then  a  subdomain  A  can 
be  forced  to  accept  traffic  it  would  otherwise  not  accept. 
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policy  requests.  This  study  does  not  address  the  scalability  of  the  SDR  mode. 

In  [2],  we  proposed  another  protocol  based  on  superdomains.  It  always  finds  a  valid  path  if 
one  exists.  Both  union  and  intersection  policy  and  ToS  constraints  are  maintained  for  each  visible 
superdomain.  If  the  union  policy  constraints  of  superdomains  on  a  path  are  satisfied,  then  the  path 
is  valid.  If  the  intersection  policy  constraints  of  a  superdomain  axe  satisfied  but  the  union  policy 
constraints  axe  not,  the  source  uses  a  query  protocol  to  obtain  a  more  detailed  “internal”  view  of 
the  superdomain,  and  searches  again  for  a  valid  path.  The  protocol  uses  a  link-state  view  update 
protocol  to  handle  topology  changes,  including  failures  that  partition  superdomains  at  any  level. 

The  landmark  hierarchy  [19,  18]  is  another  approach  for  solving  the  scaling  problem.  Here, 
each  router  is  a  landmark  with  a  radius,  and  routers  which  are  within  a  radius  away  from  the 
landmark  maintain  a  route  to  it.  Landmarks  are  organized  hierarchically,  such  that  the  radius 
of  a  landmark  increases  with  its  level,  and  the  radii  of  top  level  landmarks  include  all  routers. 
Addressing  and  packet  forwarding  schemes  are  introduced.  Link-state  algorithms  can  not  be  used 
with  the  landmark  hierarchy,  and  a  thorough  study  of  enforcing  ToS  and  policy  constraints  with 
this  hierarchy  has  not  been  done. 

The  landmark  hierarchy  may  look  similar  to  our  viewserver  hierarchy,  but  in  fact  they  are  quite 
opposite.  In  the  landmark  hierarchy,  nodes  within  the  radius  of  the  landmark  maintain  a  route  to 
the  landmark,  and  the  landmark  may  not  have  a  route  to  these  nodes.  In  the  viewserver  hierarchy, 
viewserver  maintains  routes  (i.e.  a  view)  to  the  nodes  in  its  precinct. 

Route  fragments  [7]  is  an  addressing  scheme.  A  destination  route  fragment,  called  a  route 
suffix,  is  a  sequence  of  domain  ids  from  a  backbone  to  the  destination  domain.  A  source  route 
fragment,  called  a  route  prefix ,  is  the  reverse  of  a  route  suffix  of  that  domain.  There  are  also  route 
middles,  w’hich  are  from  transit  domains  to  transit  domains.  These  addresses  axe  static  (i.e.  they 
are  not  updated  with  topology  changes)  and  stored  at  the  name  servers.  A  source  queries  a  name 
server  and  obtains  destination  route  suffixes.  It  then  chooses  an  appropriate  route  suffix  for  the 
destination  and  concatenates  it  with  its  own  route  prefix  (and  uses  routes  middles  if  route  suffix 
and  route  prefix  do  not  intersect).  This  scheme  can  not  handle  topology  changes  and  does  not 
address  handling  policy  and  ToS  constraints. 
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3  Viewserver  Hierarchy  Query  Protocol 


Id  this  section,  we  present  our  scheme  for  static  network  conditions,  that  is,  all  links  and  nodes 
remain  operational.  The  dynamic  case  is  presented  in  Section  4. 

Conventions:  Each  domain  has  a  unique  id.  Domainlds  denotes  the  set  of  domain-ids.  Each 
node  has  an  id  which  is  unique  in  its  domain.  Nodelds  denotes  the  set  of  node-ids.  Thus,  a  node  is 
totally  identified  by  the  combination  of  its  domain’s  id  and  its  node-id.  Totallds  denotes  the  set 
of  total  node-ids.  For  a  node  u,  we  use  domainid(u )  to  denote  the  domain-id  of  it’s  domain.  We 
use  nodeidlu)  and  totalid(u)  to  denote  the  node-id  and  total-id  of  u  respectively.  For  a  domain  A, 
we  use  domainid(A)  to  denote  the  domain-id  of  A.  Node  Neighborly.)  denotes  the  set  of  node-ids 
of  the  neighbors  of  u.  DomainNeighbors(A )  denotes  the  set  of  domain-ids  of  the  domain  neighbors 
of  A.  We  use  the  term  gateway-id  (or  viewserver-id)  to  mean  the  total-id  of  a  gateway  node  (or  a 
viewserver  node). 

In  our  protocol,  a  node  u  uses  two  lands  of  sends.  The  first  kind  has  the  form  “Send(m)  to  r”, 
where  m  is  the  message  being  sent  and  v  is  the  total-id  of  the  destination.  Here,  nodes  v.  and  v 
axe  neighbors,  and  the  message  is  sent  over  the  physical  link  (u,t>).  If  the  link  is  down,  we  assume 
that  the  packet  is  dropped. 

The  second  kind  of  send  has  the  form  uSend(m)  to  v  using  dlsr1”,  where  m  and  v  axe  as  above 
and  dlsr  is  a  domain-level  source  route  between  u  and  v.  Here,  the  message  is  sent  using  the  intra¬ 
domain  routing  protocols  of  the  domains  in  dlsr  to  reach  r10.  We  assume  that  as  long  as  there  is  a 
sequence  of  up  links  connecting  the  domains  in  dlsr,  the  message  is  delivered  to  r11.  If  the  u  and 
v  axe  in  the  same  domain,  dlsr  equals  (). 

Views  and  Viewservers 

Domain-level  views  axe  maintained  by  special  nodes  called  viewservers.  Each  viewserver  has  a 
precinct ,  which  is  a  set  of  domains  around  the  viewserver,  and  a  static  view ,  which  is  a  domain-level 
view  of  the  precinct  and  outgoing  edges.  The  static  view  includes  the  ToS  and  policy  constraints 

10  Recall  that  gives  a  domain-level  source  route  to  a  destination,  using  the  intra-domain  routing  protocols  we  r-a* 
reach  the  destination. 

12  This  involves  time-outs,  retransmissions,  etc.  It  requires  a  transport  protocol  support  such  as  TCP. 
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of  domains  in  the  precinct  and  of  domain-level  edges12.  Formally,  a  viewserve:  z  maintains  the 
following: 

PTecincis.  (C  Domainlds).  Domain-ids  whose  view  is  maintained. 

SView-.  Static  view  of  x. 

=  {(j4,  policyktos(A),  {{5,  edgejpolicyhtos{A,  B)) :  B  €  subset  of  DomainNeighbors{A)}) 
A  £  Precincts } 

SViewx  can  be  implemented  as  adjacency  list  representation  of  graphs  [1].  The  intention  of 
SViewx  is  to  obtain  domain-level  source  routes  between  nodes  in  Precincts.  Hence,  the  choice  of 
domains  to  include  in  Precinct ~  and  the  choice  of  neighbors  of  domains  to  include  in  SViewx  is 
not  arbitrary.  Precincts  and  SVie wx  must  be  connected;  that  is,  between  any  two  domains  in 
Precincts,  there  should  be  a  path  in  SVietu*  that  lies  in  Precincts .  Note  that  SViewx  can  contain 
edges  to  domains  outside  Precincts ■  We  say  that  a  domain  A  is  in  the  view  of  a  viewserver  x,  if 
either  A  is  in  the  precinct  of  i  or  SViewx  has  an  edge  from  a  domain  in  precinct  to  A.  Note  that 
the  precincts  and  views  of  different  view  servers  can  be  over\appwa,  identical  or  disjoint. 

If  there  is  a  viewserver  x  whose  view  contains  both  the  source  and  the  destination  domains, 
then  s’s  view  can  be  used  to  obtain  the  required  domain-level  source  route  to  reach  the  destination. 
The  source  needs  to  reach  x  to  obtain  its  view.  If  the  source  and  x  are  in  the  same  domain,  x 
can  be  reached  using  the  intra-domain  routing  protocol.  If  x  is  in  another  domain,  then  the  source 
needs  to  have  a  domain-level  source  route  to  it13.  In  this  case,  we  assume  that  source  has  a  set  of 
faced  domain-level  source  routes  to  x. 

Viewserver  Hierarchy 

For  scaling  reasons,  we  cannot  have  one  large  view.  Thus,  obtaining  a  domain-level  source  route 
between  a  source  and  a  destination  which  are  far  away,  involves  accumulating  views  of  a  sequence  of 
viewservers.  To  keep  this  process  efficient,  we  organize  viewservers  hierarchically.  More  precisely, 
each  viewserver  is  assigned  a  hierarchy  level  from  0,1,...,  with  0  being  the  top  level  in  the  hierarchy. 
A  parent/child  relationship  between  viewservers  is  defined  as  follows: 

12  Not  ah  the  domain-level  edges  need  io  be  included.  This  is  because  some  domains  may  have  many  neighbors 
causing  a  big  storage  requirement. 

13  We  cannot  obtain  this  domain-level  sonrce  route  from  x,  i.e.  chicken-egg  problem. 
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1.  Every  level  i  viewservex,  i  >  0,  has  a  parent  viewserver  whose  level  is  less  than  i. 

2.  If  viewserver  z  is  a  parent  of  viewserver  y  then  x’s  view  contains  y’s  domain  and  y’s  view 
contains  z's  domain14. 

3.  The  view'  of  a  top  level  viewserver  contains  the  domains  of  all  other  top  level  viewservers. 
(typically,  top  level  viewservers  are  placed  in  backbones). 

Note  that  the  second  constraint  does  not  mean  that  all  top  level  viewservers  have  the  same  view. 
In  the  hierarchy,  a  parent  can  have  many  children  and  a  child  can  have  many  parents.  "We  extend 
the  range  of  the  parent-child  relationship  to  ordinary  nodes;  that  is  if  the  Precinct *  contains  the 
domain  of  node  u,  we  say  that  u  is  a  child  of  z,  and  x  is  a  parent  of  u  (note  that  an  ordinary  node 
does  not  have  a  child).  We  assume  that  there  is  at  least  one  parent  viewserver  for  each  node. 

For  a  node  it,  an  address  is  defined  to  be  a  sequence  (z0,x\,..  .,zt)  such  that  x;  for  i  <  t  is 
a  viewserver-id,  xo  is  a  top  level  viewserver-id,  zt  is  the  total-id  of  u,  and  x,-  is  a  parent  of  x,-+i . 
Note  that  anode  may  have  many  addresses  since  the  parent-child  relationship  is  many-to-many.  If 
a  source  wants  a  domain-level  source  route  to  a  destination,  it  first  queries  the  name  servers  15  to 
obtain  a  set  of  addresses  for  the  destination.  Then,  it  queries  viewservers  to  obtain  an  accumulated 
view  containing  both  its  domain  and  the  destination’s  domain. 

Querying  the  name  servers  can  be  done  the  same  way  it  is  done  currently  in  the  Internet.  It 
requires  nodes  to  have  a  set  of  fixed  addresses  to  name  servers.  This  is  also  sufficient  in  our  case. 
However,  we  can  improve  the  performance  by  having  a  set  of  fixed  domain-level  source  routes 
instead. 

View-Query  Protocol:  Obtaining  Domain-Level  Source  Routes 

We  now  describe  how  a  domain-level  source  route  is  obtained  (regardless  of  whether  the  source  and 
the  destination  are  in  a  common  view  or  not). 

We  want  a  sequence  of  viewservers  whose  merged  views  contains  both  the  source  and  the 
destination  domains.  Addresses  provide  a  way  to  obtain  such  a  sequence,  by  first  going  up  in 
the  viewserver  hierarchy  starting  from  the  source  node  and  then  going  down  in  the  viewserver 
hierarchy  towards  the  destination  node.  More  precisely,  let  (so,-..,St)  be  an  address  of  the  source, 

34  Note  that  z  and  y  do  not  have  to  be  in  each  other’s  precinct. 

35  In  fact,  name  servers  are  called  domain  name  servers.  However,  domain  names  and  the  domains  used  in  this 
paper  are  different.  We  use  “name  servers"  to  avoid  coninsion. 


and  (do, be  an  address  of  the  destination.  Then,  the  sequence  (sJ_1,...,so)d0, . 
meets  our  requirements.16  In  fact,  going  up  all  the  way  in  the  hierarchy  to  top  level  viewservers 
may  not  be  necessary.  We  can  stop  going  up  at  a  viewserver  s;  if  there  is  a  viewserver  dj,j  <  l 
such  that  the  domain  of  i;  is  in  the  view  of  s,  (one  special  case  is  where  s,  =  dj). 

The  view-query  protocol  uses  two  message  types: 

•  (RequestView,  sjaddress,  djaddress) 

where  s-address  and  djaddress  are  the  addresses  for  the  source  and  the  destination  respec¬ 
tively.  A  RequestView  message  is  sent  by  a  source  to  obtain  an  accumulated  view  containing 
both  the  source  and  the  destination  domains.  When  a  viewserver  receives  a  RequestView 
message,  it  either  sends  back  its  view  or  forwards  this  request  to  another  viewserver. 

•  (ReplyView,  sjaddress ,  djiddress ,  accumview) 

where  sjaddress  and  djaddress  are  as  above  and  accumview  is  the  accumulated  view.  A 
ReplyView  message  is  sent  by  a  viewserver  to  the  source  or  to  another  viewserver  closer  to 
the  source.  The  accumview  field  in  a  Reply  Vie  v  message  equals  the  union  of  the  views  of 
the  viewservers  the  message  has  visited. 

We  now  describe  the  events  of  a  source  node  (see  Figure  2).  The  source  node17  sends  a 
RequestView  packet  containing  a  source  and  a  destination  address  to  its  parent  in  the  source  ad¬ 
dress  (using  a  fixed  domain-level  source  route).  When  the  source  receives  a  ReplyView  packet,  it 
chooses  a  valid  path  using  the  accumview  in  the  packet.  If.it  does  not  find  a  valid  path,  it  can 
try  again  using  a  different  source  and/or  destination  address.  Note  that,  the  source  does  not  have 
to  throw  away  the  previous  accumulated  view’s,  but  merge  all  accumulated  views  into  a  richer  ac¬ 
cumulated  view.  In  fact,  it  is  easy  to  change  the  protocol  so  that  source  can  also  obtain  views  of 
individual  viewservers  to  make  the  accumulated  view  even  richer. 

The  events  of  a  viewserver  x  are  specified  in  Figure  3.  Upon  receiving  a  RequestView  packet, 
z  checks  if  the  destination  domain  is  in  its  precinct18.  If  it  is,  x  sends  back  its  view  in  a  ReplyView 
packet;  If  it  is  not,  x  forwards  the  request  packet  to  another  viewserver  as  follows:  x  checks  if  the 
domain  of  any  viewserver  in  the  destination  address  is  in  its  view  or  not.  If  there  is  such  a  domain, 

3t  This  is  similiax  to  matching  route  fragmenis[7).  However,  in  our  case  the  sequence  is  computed  in  a  distributed 
fashion  (these  is  needed  to  handle  topology  changes) . 

17  or  the  policy  server  in  the  source’s  domain 

11  Even  though  destination  can  be  in  the  view  of  x,  its  policies  and  ToS’s  are  not  in  the  view  if  it  is  not  in  the 
precinct  of  s. 
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Constants 


FizcdRoutcsu{z),  for  every  viewserver-id  r  such  that  z  is  a  parent  of  u, 

=  /  <05 

\  {{d\, .  ..,dn)  :  di  €  Domainlds}.  Set  of  domain-level  routes  to  z 


if  domainid(u)  =  domainid(z) 
otherwise 


Events 

RcqucslVicwu(s.addrcss,  djaddrcss )  {Executed  when  v  wants  a  valid  domain-level  source  route} 

Let  s-addrcss  be  (so, . .  .,Si_i,St),  and  dlsr  €  FixedRoutcsu(st-i); 

Send(RequestVies,  s-address,  djaddrcss)  to  st_i  using  dlsr 


Receive^  (Reply Vies,  sjaddrcss,  djiddrcss ,  accumview) 

Choose  a  valid  domain-level  source  route  using  accumview, 

If  a  valid  route  is  not  found 

Execute  RcqucstViewu  again  with  another  source  address  and/or  destination  address 


Figure  2:  View-query  protocol:  Events  and  state  of  a  source  u. 


Constants 

Precinctx.  Precinct  of  x. 

SVicwz .  Static  view  of  z. 

Events 

f£eceive-(RequestVie¥,  s.address,  d-addrcss ) 

Let  djiddrcss  be  (d0, . . . ,  dt); 
if  domainid(di)  £  Precinct^  then 

/ orwardx (Re que st Vies,  s .address,  djaddrcss,  {}); 
else  / orward- (ReplyViev,  d-addrcss,  sjaddrcss,  SViewx);  {addresses  are  switched} 

endif 

Receivc-(Itep lyView,  sjaddrcss ,  d-address,  view) 

f  orward- (Reply V i es,  sjaddress,  djaddrcss,  view  V  SViewz) 

where  procedure  j  orwardx{lype,  s .address ,  djaddrcss,  view ) 

Let  sjiddress  be  <s0 , . . . ,  st ) ,  duaddress  be  (d0 , . . . ,  di) ; 
if  Bi  :  domainid(di)  in  SView.  then 

Let  i  =  max{ j  :  domainid(dj)  in  SVjcu;-}; 
target  :=  d,; 

else  target  :=  s,  such  that  s<+i  =  iotalid(z); 
endif; 

dlsr  :=  choose  a  route  to  domainid(target )  from  domainid(z)  using  SViewx; 
if  type  =  RequestViev  then 

Send(RequestViev,  sjaddrcss,  djiddress )  to  target  using  dlsr; 
else  Send(ReplyView,  sjiddress,  djaddrcss,  view)  to  target  using  dlsr ; 
endif 


Figure  3:  View-query  protocol:  Events  and  state  of  a  viewserver  x. 
z  sends  the  RequestViev  packet  to  the  last  such  one.  Otherwise  x  is  a  viewserver  in  the  source 
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address  and  sends  the  packet  to  its  parent  in  the  source  address.  (Note  that  if  x  is  a  viewserver  in 
the  destination  address,  its  child  in  the  destination  address  is  definitely  in  its  view.) 

When  a  viewserver  x  receives  a  ReplyView  packet,  it  merges  its  view  to  the  accumulated  view 
in  the  packet.  Then  it  sends  the  ReplyView  packet  towards  the  source  node  same  way  it  would 
send  a  RequestView  packet  towards  the  destination  node  (i.e.  the  role  of  the  source  address  and 
the  destination  address  are  changed). 

Above  we  have  described  one  possible  way  of  obtaining  the  accumulated  views.  There  are 
various  other  possibilities,  for  example:  (1)  restricting  the  ReplyView  packet  to  take  the  reverse 
of  the  path  that  the  RequestView  packet  took;  (2)  having  ReplyView  packets  go  all  the  way 
up  in  the  viewserver-hierarchy  for  a  richer  accumulated  view;  (3)  source  polling  the  viewservers 
directly  instead  of  viewservers  forwarding  request/reply  messages  to  each  other;  (4)  not  including 
the  non-transit  stub  domains  other  than  the  source  and  the  destination  domains  in  the  accumview, 
(5)  including  some  source  policy  constraints  and  ToS  requirements  in  the  RequestView  packet, 
and  having  the  viewservers  filter  out  some  domains. 

4  Update  Protocol  for  Dynamic  Network  Conditions 

In  this  section,  we  first  examine  how  topology  changes  such  as  link /node  failures,  repairs,  and  cost 
changes,  map  into  domain-level  topology  changes.  Second,  we  describe  how  domain-level  topology 
changes  are  detected  and  communicated  to  viewservers,  i.e.  view-update  protocol.  Third,  we  modify 
the  view-query  protocol  appropriately. 

Mapping  Topology  Changes  to  Domain-Level  Topology  Changes 

f  Costs  are  associated  with  domain-level  edges.  The  cost  of  the  domain-level  edge  (A,  B)  equals  a 

vector  of  values  if  the  link  is  up;  each  cost  value  indicates  how  expensive  it  is  to  cross  domain  A 
to  reach  domain  B  according  to  some  criteria  such  as  delay,  throughput,  reliability,  etc.  The  cost 
equals  oo  if  all  links  from  A  to  B  are  down19.  Each  cost  value  of  a  domain-level  edge  (A,B)  can 
be  derived  from  the  cost  values  of  the  intra-domain  routes  in  A  and  links  from  AtoB  [4]20. 

J®  Note  tint  if  a  gateway  connecting  A  to  B  is  down,  its  Jinks  are  also  considered  to  be  down. 

20  For  example,  the  delay  of  a  domain-level  edge  {A,B)  can  be  calculated  as  the  maximum/ average  delay  of  the 
routes  from  any  gateway  in  .A  to  first  gateway  in  B. 
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Link  cost  changes  and  link/node  failures  and  repairs  correspond  to  cost  changes,  failures  and 
repairs  of  domain- level  edges.  Link /node  failures  can  also  partition  a  domain  into  cells[15].  A  cell 
is  a  maximal  subset  of  nodes  of  a  domain  that  can  reach  each  other  without  leaving  the  domain. 
With  partitioning,  some  nodes  as  well  as  some  neighbor  domains  may  not  be  accessible  by  all 
cells.  In  the  same  way,  link/node  repairs  may  merge  cells  into  bigger  cells.  We  identify  a  cell 
with  the  minimum  node-id  of  the  gateways  in  the  cell.  21  In  this  paper,  for  uniformity  we  treat 
an  unpartitioned  domain  as  a  domain  with  one  cell;  we  do  not  consider  cells  that  do  not  isolate 
gateways  since  such  cells  do  not  affect  inter-domain  routes. 

If  a  domain  gets  partitioned,  its  vertex  in  the  domain-level  views  should  be  split  into  as  many 
pieces  as  there  are  cells.  And  when  the  cells  merge,  the  corresponding  vertices  should  be  merged 
as  well. 

Since  a  domain  can  be  partitioned  into  many  cells,  domain-level  source  routes  now  include  cell- 
ids  as  well.  Hence,  the  intra-domain  routing  protocol  of  a  domain  should  include  a  route  to  each 
reachable  neighbor  domain  cell.22 

View-Update  Protocol:  Updating  Domain-Level  Views 

Viewservers  do  not  communicate  with  each  other  to  maintain  their  views.  Gateways  detect  and 
communicate  domain-level  topology  changes  to  viewservers.  Each  gateway  periodically  (and  op¬ 
tionally  after  a  change  in  the  intra-domain  routing  table)  inspects  its  intra-domain  routing  table 
and  determines  the  cell  it  belongs.  For  each  cell,  only  the  gateway  whose  node-id  is  the  cell-id 
(i.e.  the  gateway  with  the  minimum  node-id)  is  responsible  for  communicating  domain-level  topol¬ 
ogy  changes.  We  refer  to  this  gateway  as  the  reporting  gateway.  Reporting  gateways  compute 
the  domain-level  edge  costs  for  each  neighbor  domain  cell,  and  report  them  to  parent  viewservers. 
They  are  also  responsible  for  informing  the  viewservers  of  the  creation  and  deletion  of  cells. 

The  communication  between  a  reporting  gateway  and  viewservers  is  done  by  hooding  over  a 
set  of  domains.  This  set  is  referred  to  as  the  flood  area23.  The  topology  of  a  flood  area  must 

25  Our  cells  are  like  the  domain  components  of  IDPEJ13]. 

22  This  involves  the  following  changes  in  the  intra-domain  routing  protocol-  (1)  Whenever  the  cell-id  of  a  gateway 
changes,  it  reports  its  new  cell-id  to  adjacent  gateways  in  neighbor  domains.  When  they  receive  this  information, 
they  update  their  intra-domain  routes  to  include  the  new  cell-id.  (2)  Usually  when  a  node  recovers  from  a  failure, 
it  queries  its  neighbors  in  its  domain  for  their  intra-domain  routes.  When  a  gateway  recovers,  it  should  also  query 
adjacent  gateways  in  neighbor  domains  for  their  cell-ids. 

23  For  efficiency,  the  flood  area  can  be  implemented  by  a  radius  and  some  forwarding  limits  (e.g.  do  not  flood 
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be  a  connected  graph.  Due  to  the  nature  of  flooding,  a  viewserver  can  receive  information  out  of 
order  for  a  domain  cell.  In  order  to  avoid  old  information  replacing  new  information,  each  gateway 
includes  successively  increasing  time  stamps  in  the  messages  it  sends. 

Due  to  node  and  link  failures,  communication  between  a  reporting  gateway  and  a  viewserver 
can  fail,  resulting  in  the  viewserver  having  out-of-date  information.  To  eliminate  such  information, 
a  viewserver  deletes  any  information  about  a  domain  cell  if  it  is  older  than  a  time-to-die  period.  We 
assume  that  gateways  send  messages  more  often  than  the  time-to-die  value  (to  avoid  false  removal). 

When  a  viewserver  learns  of  a  new  domain  cell,  it  adds  it  to  its  view.  To  avoid  adding  a  domain 
cell  which  was  just  deleted24,  when  a  viewserver  receives  a  delete  domain  cell  request,  it  only  marks 
the  domain  cell  as  deleted  (and  removes  the  entry  after  the  time-to-die  period). 

The  view-update  protocol  uses  two  types  of  messages  as  follows: 

•  (UpdateCell,  domainid,  cellid,  timestamp ,  floodarea,  ncostset ) 

is  sent  by  the  reporting  gateway  to  inform  the  viewservers  about  current  domain-level  edge 
costs  of  its  cell.  Here,  domainid ,  cellid,  and  timestamp  indicate  the  domain,  the  cell  and  the 
time  stamp  of  the  reporting  gateway,  ncostset  contains  a  cost  for  each  neighbor  domain  cell, 
and  floodarea  is  the  set  of  domains  that  this  message  is  to  be  sent  over. 

•  (DeleteCell,  domainid,  cellid,  timestamp,  floodarea) 

where  the  parameters  are  as  in  the  UpdateCell  message.  It  is  sent  by  a  reporting  gateway 
when  it  becomes  non-reporting  (because  its  cell  expanded  to  include  a  gateway  with  lower 
id),  to  inform  viewservers  to  delete  the  gateway’s  old  cell. 

The  state  maintained  by  a  gateway  g  is  listed  in  Figure  4.  Note  that  LocalViewserverss  and 
LocalGatewayss  can  be  empty.  JntraDomainRTs  contains  a  route  (next-hop  or  source)  for  every 
reachable  node  of  the  domain  and  for  every  reachable  neighbor  domain  cell25.  We  assume  that 
consecutive  reads  of  Clockg  returns  increasing  values. 

The  state  maintained  by  a  viewserver  x  is  listed  in  Figure  5.  DViews  is  the  dynamic  part  of 
x’s  view.  For  each  domain  cell26  known  to  x,  DViewx  stores  a  timestamp  field  which  equals  the 

beyond  backbones)  instead  of  a  set. 

2<  If  the  domain  cell  was  removed,  the  timestamp  foi  that  domain  cell  is  also  lost. 

2i  IntraDomainUTf  is  a  view  in  case  of  a  link-state  rooting  protocol  or  a  distance  table  in  case  of  a  distance- vector 
rooting  protocol. 

26  We  use  A:g  to  denote  the  cell  g  of  domain  A. 
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Constants: 

LocalViewscrverSj .  (C  Totallds).  Set  of  viewservers  in  p's  domain. 
LocalGateways,.  (C  Totallds).  Set  of  gateways  in  p’s  domain  excluding  p. 
AdjForeignGatewaySg.  (C  Totallds).  Set  of  adjacent  gateways  in  other  domains. 
FloodArta,.  (C  Domaialds).  The  flood  area  of  the  domain  (includes  domain  of  p). 
Variables: 

IntraDomainRTg .  Intra-domain  routing  table  of  p.  Initially  contains  no  entries. 
Cellld,  :  lodelds.  The  id  of  p’s  cell.  Initially  =  oo 
Clock,  :  Integer.  Clock  of  g. 

Figure  4:  State  of  a  gateway  g. 


Constants: 

Precinct-.  Precinct  of  x. 

SVie wx.  Static  view  of  x. 

TimeToDiex  :  Integer.  Time-to-die  value. 

Variables: 

DVie wx.  Dynamic  view  of  x. 

=  {(A:g,  timestamp,  expirytime,  deleted, 

{{B:h,  cost)  :  B  €  Doma:nNeighbors(A)  A  h  €  Bodelds  U  {*}  })  : 

A  €  Precinct-  A  p  €  Bodelds} 

Clock-  :  Integer.  Clock  of  x. 

Figure  5:  State  of  a  viewserver  x. 

largest  timestamp  received  for  this  domain  cell,  an  expirytime  field  which  equals  the  end  of  the 
time-to-die  period  for  this  domain  cell,  a  deleted  field  which  marks  whether  or  not  the  domain  cell 
is  deleted,  and  a  cost  set  which  indicates  a  cost  for  every  neighbor  domain  cell  whose  domain  is  in 
SViews.  The  cell-id  of  a  neighbor  domain  equals  *  if  no  cell  of  the  neighbor  domain  is  reachable. 
The  events  of  gateway  g  and  a  viewserver  x  are  specified  in  Appendix  A. 

Changes  to  View-Query  Protocol 

We  now  enumerate  the  changes  needed  to  adapt  the  view- query  protocol  to  the  dynamic  case  (the 
formal  specification  is  omitted  for  space  reasons). 

Due  to  link  and  node  failures,  RequestViev  and  ReplyView  packets  can  get  lost.  Hence,  the 
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source  may  never  receive  a  Reply Vies  packet  after  it  initiates  a  request.  Tims,  the  source  should 
try  again  after  a  time-out  period. 

When  a  viewserver  receives  a  RequestViev  message,  in  the  static  case  it  replies  with  its  view 
if  the  destination  domain  is  in  its  precinct.  Now,  because  domain-level  edges  can  fail,  it  must  also 
check  its  dynamic  view  and  reply  with  its  views  only  if  its  dynamic  view  contains  a  path  to  the 
destination.  Similarly  during  forwarding  of  RequestViev  and  ReplyViev  packets,  a  viewserver, 
while  checking  whether  a  domain  is  in  its  view,  should  also  check  if  its  dynamic  view  contains  a 
path  to  it. 

Finally’,  when  a  viewserver  sends  a  message  to  a  node  whose  domain  is  partitioned,  it  should 
send  a  copy  of  the  message  to  each  cell  of  the  domain.  This  is  because  a  viewserver  does  not  know 
which  cell  contains  the  node. 

5  Evaluation 

Many  inter-domain  routing  protocols  have  been  proposed,  based  on  various  kinds  of  hierarchies. 
How  do  these  protocols  compare  against  each  other  and  against  the  simple  approach?  To  answer  this 
question,  we  need  a  model  in  which  we  can  define  internetwork  topologies,  policy /ToS  constraints, 
inter-domain  routing  hierarchies,  and  evaluation  measures  (e.g.  memory  and  time  requirements) 
for  inter-domain  routing  protocols. 

In  this  section,  we  first  present  such  a  model,  and  then  use  the  model  to  evaluate  our  viewserver 
hierarchy  and  compare  it  to  the  simple  approach.  Our  evaluation  measures  are  the  amount  of 
memory  required  at  the  source  and  at  the  routers,  the  amount  of  time  needed  to  construct  a  path, 
and  the  number  of  paths  found  out  of  the  total  number  of  valid  paths. 

Even  though  the  model  described  here  can  be  applied  to  other  inter-domain  routing  protocols, 
we  have  not  done  so,  and  hence  have  not  compared  them  against  our  viewserver  hierarchy.  This 
is  because  of  lack  of  time,  and  because  precise  definitions  of  the  hierarchies  in  these  protocols  is 
not  available.  For  example,  to  do  a  fair  evaluation  of  IDPIt[13],  we  need  precise  guidelines  for 
how  to  group  domains  into  super-domains,  and  how  to  choose  between  the  union  and  intersection 
methods  when  defining  policy/ToS  constraints  of  super-domains.  In  fact,  these  protocols  have  not 
been  evaluated  in  a  way  that  we  can  compare  them  to  the  viewserver  hierarchy.  To  the  best  of  our 
knowledge,  this  paper  is  the  first  to  evaluate  a  hierarchical  inter-domain  routing  protocol  against 
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explicitly  stated  policy  constraints. 


5.1  Evaluation  Model 

We  first  describe  our  method  of  generating  topologies  and  policy/ToS  constraints.  We  then  describe 
the  evaluation  measures. 

Generating  Internetwork  Topologies 

For  our  purposes,  an  internetwork  topology  is  a  directed  graph  where  the  nodes  correspond  to 
domains  and  the  edges  correspond  to  domain-level  connections.  However,  an  arbitrary  graph  will 
not  do.  The  topology  should  have  the  characteristics  of  a  real  internetwork,  like  the  Internet.  That 
is,  it  should  have  backbones,  regionals,  MANS,  LANS,  etc.;  these  should  be  connected  hierarchically 
(e.g.  regionals  to  backbones),  but  “non-hierarchical”  connections  (e.g.  “back-doors”)  should  also 
be  present. 

For  brevity,  we  refer  to  backbones  as  class  0  domains,  regionals  as  class  1  domains,  metropolitan- 
area  domains  and  providers  as  class  2  domains,  and  campus  and  local-area  domains  as  class  3 
domains.  A  (strictly)  hierarchical  interconnection  of  domains  means  that  class  0  domains  are 
connected  to  each  other,  and  for  :  >  0,  class  i  domains  are  connected  to  class  i  —  1  domains. 
As  mentioned  above,  we  also  want  some  “non-hierarchical”  connections,  i.e.,  domain-level  edges 
between  domains  irrespective  of  their  classes  (e.g.  from  a  campus  domain  to  another  campus 
domain  or  to  a  backbone  domain). 

In  reality,  domains  span  geographical  regions  and  domain-level  edges  are  usually  between  do¬ 
mains  that  are  geographically  close  (e.g.  University  of  Maryland  campus  domain  is  connected  to 
SURANET  regional  domain  which  is  in  the  east  cost).  A  class  i  domain  usually  spans  a  larger 
geographical  region  than  a  class  i  +  1  domain.  To  generate  such  interconnections,  we  associate  a 
“region”  attribute  to  each  domain.  The  intention  is  that  two  domains  with  the  same  region  are 
geographically  dose. 

The  region  of  a  class  i  domain  has  the  form  ro-rj.  •••  j-i,  where  the  r^-’s  are  integers.  For 
example,  the  region  of  a  dass  3  domain  can  be  1.2. 3. 4.  For  brevity,  we  refer  to  the  region  of  a 
dass  i  domain  as  a  dass  i  region. 

Note  that  regions  have  their  own  hierarchy.  Class  0  regions  are  the  top  level  regions.  We  say 
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thai  a  class  i  region  ro.rj.  ••  -  .rj  is  contained  in  the  ciass  t  -  1  region  xo-rj.  •  •  -  .Xi_i  (where  :  >  0). 
Containment  is  transitive.  Thus  region  1.2. 3. 4  is  contained  in  regions  1.2.3,  1.2  and  1. 
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Figure  6:  Regions 

Given  any  pair  of  domains,  we  classify  them  as  local,  remote  ox  fax,  based  on  their  regions. 
Let  X  be  a  class  i  domain  and  Y  a  class  j  domain,  and  (without  loss  of  generality)  let  i  <  j. 
X  and  y  are  local  if  they  axe  in  the  same  class  i  region.  For  example  in  Figure  6,  A  is  local  to 
B ,C,J,K,M,N,0,P,  and  Q.  X  and  Y  are  remote  if  they  axe  not  in  the  same  class  i  region  but 
they  are  in  the  same  class  —  1  region,  or  if  :  =  0.  For  example  in  Figure  6,  some  of  the  domains 
A  is  remote  to  are  D,E,  F,  and  L.  X  and  Y  axe  far  if  they  are  not  local  or  remote.  For  example 
in  Figure  6,  A  is  far  to  I. 

We  refer  to  a  domain-level  edge  as  local  ( remote ,  or  far)  if  the  two  domains  it  connects  axe  local 


(remote,  or  far). 

We  use  the  following  procedure  to  generate  internetwork  topologies: 

•  We  first  specify  the  number  of  domain  classes,  and  the  number  of  domains  in  each  class. 

•  We  next  specify  the  regions.  Note  that  the  number  of  region  classes  equals  the  number  of 
domain  classes.  We  specif}'  the  number  of  class  0  regions.  For  each  class  i  >  0,  we  specify  a 
branching  factor ,  which  creates  that  many  class  i  regions  in  each  class  i  —  1  region.  (That  is, 
if  there  are  two  class  0  regions  and  the  class  1  branching  factor  equals  three,  then  there  are 
six  class  1  regions.) 

•  For  each  class  i,  we  randomly  map  the  class  :  domains  into  the  class  i  regions.  Note  that 
several  domains  can  be  mapped  to  the  same  region,  and  some  regions  may  have  no  domain 
mapped  into  them. 

•  For  every  class  :  and  every  class  j,  j  >  i,  we  specify  the  number  of  local,  remote  and  far 
edges  to  be  introduced  between  class  i  domains  and  class  j  domains.  The  end  points  of  the 
edges  are  chosen  randomly  (within  the  specified  constraints). 

We  ensure  that  the  internetwork  topology  is  connected  by  ensuring  that  the  subgraph  of  class 
0  domains  is  connected,  and  each  class  i  domain,  for  i  >  0,  is  connected  to  a  local  class  t  —  1 
domain. 

Choosing  Policy /ToS  Constraints 

We  chose  a  simple  scheme  to  model  Policy /ToS  constraints.  Each  domain  is  assigned  a  color:  green 
or  red.  For  each  domain  class,  we  specify'  the  percentage  of  green  domains  in  that  class,  and  then 
randomly  choose  a  color  for  each  domain  in  that  class. 

A  valid  route  from  a  source  to  a  destination  is  one  that  does  not  visit  any  red  intermediate  do¬ 
mains;  the  source  and  destination  are  allowed  to  be  red.  Notice  that  this  models  transit  policy /ToS 
constraints.  We  are  working  on  extending  this  model  to  source  policy/ToS  constraints. 

Computing  Evaluation  Measures 

The  evaluation  measures  of  most  interest  foT  an  inter- domain  routing  protocol  are  its  memory  and 
time  requirements,  and  the  number  of  valid  paths  it  finds  (and  their  lengths)  in  comparison  to 
the  number  of  available  valid  paths  (and  their  lengths)  in  the  internetwork  (e.g.  could  it  find  the 
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shortest  valid  path  in  the  internetwork). 

The  only  analysis  method  we  have  at  present  is  to  numerically  compute  the  evaluation  measures 
lor  a  variety  of  source-destination  pairs.  Because  we  use  internetwork  topologies  of  large  sizes,  it  is 
not  feasible  to  compute  for  all  possible  source-destination  pairs.  We  randomly  choose  a  set  of  source- 
destination  pairs  that  satisfy  the  following  conditions:  (1)  the  source  and  destination  domains  are 
different,  and  (2)  there  exists  a  valid  path  from  the  source  domain  to  the  destination  domain  in 
the  internetwork  topology.  (Note  that  the  simple  scheme  would  always  find  such  a  path.) 

For  a  source-destination  pair,  we  refer  to  the  length  of  the  shortest  valid  path  in  the  internetwork 
topology  as  the  shortest-path  length.  Since  the  number  of  paths  between  a  source-destination  pair 
is  potentially  very  large  (factorial  in  the  number  of  domains),  and  we  are  not  interested  in  the 
paths  that  are  too  long,  we  only  count  the  number  of  paths  whose  lengths  are  not  more  than  the 
shortest-path-length  plus  2. 

The  evaluation  measures  described  above  are  protocol  independent.  However,  there  are  also 
important  evaluation  measures  that  are  protocol  dependent  (e.g.  number  of  levels  traversed  in 
some  particular  hierarchy).  Because  of  this  we  postpone  the  precise  definitions  of  the  evaluation 
measures  to  the  next  subsection  (their  definition  is  dependent  of  viewserver  hierarchy). 

5.2  Application  to  Viewserver  Protocol 

We  have  used  the  above  model  to  evaluate  our  viewserver  protocol  for  several  different  viewserver 
hierarchies  and  query  methods.  We  first  describe  the  different  viewserver  schemes  evaluated.  Please 
refer  to  Figure  6  in  the  following  discussion. 

The  first  viewserver  scheme  is  referred  to  as  base.  It  has  exactly  one  viewserver  in  each  domain. 
Each  viewserver  is  identified  by  its  domain-id.  The  domains  in  a  viewserver’s  precinct  consist  of 
its  domain  and  the  neighboring  domains.  The  edges  in  the  viewserver’s  view  consist  of  the  edges 
between  the  domains  in  the  prednct,  and  edges  outgoing  from  domains  in  the  precinct  to  domains 
not  in  the  prednct.  For  example,  the  prednct  of  viewserver  A  (i.e.  the  viewserver  in  domain  A) 
consists  of  domains  A,B,J ;  the  edges  in  the  view  of  viewserver  A  consists  of  domain-level  edges 
(A,  P),  (A,  J),  (B,  J),  (J,  M),  ( J,  K),  ( J ,  F ),  and  (/,!?). 

As  for  the  viewserver  hierarchy,  a  viewserver’s  level  is  defined  to  be  the  dass  of  its  domain.  That 
is,  a  viewserver  in  a  dass  i  domain  is  a  level  i  viewserver.  For  each  level  i  viewserver,  i  >  0,  its 
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parent  viewserver  is  chosen  randomly  from  the  level :  - 1  viewservers  in  the  parent  region  such  that 
there  is  a  domain-level  edge  between  the  viewservers  domain  and  the  parent  viewserver’s  domain. 
For  example,  for  viewserver  C,  we  can  pick  viewserver  J  or  K;  suppose  we  pick  J.  For  viewserver 
J,  we  have  no  choice  but  to  pick  M  (N  and  0  are  not  connected  to  J).  For  M,  we  pick  P  (out  of 
P  and  Q). 

We  use  only  one  address  for  each  domain.  The  viewserver-address  of  a  stub  domain  is  con¬ 
catenation  of  four  viewserver  (i.e.  domain)  ids.  Thus,  the  address  of  A  is  P.M.J.A.  Similarly,  the 
address  of  E  is  P.M.K.H.  To  obtain  a  route  between  A  and  E,  it  suffices  to  obtain  views  of 
viewservers  A,  J,  K,  E. 

The  second  viewserver  scheme  is  referred  to  as  base-QT  (where  the  QT  stands  for  “query  upto 
top”).  It  is  identical  to  base  except  that  during  the  query  protocol  all  the  viewservers  in  the  source 
and  the  destination  addresses  are  queried.  That  is,  to  obtain  a  route  between  A  and  E,  the  views 
of  A,J,M,P,E,E  are  obtained. 

The  third  viewserver  scheme  is  referred  to  as  locals.  It  is  identical  to  base  except  that  now  a  • 
viewserver’s  precinct  also  contains  domains  that  have  the  same  region  as  the  viewserver’s  domain. 
That  is,  the  precinct  of  viewserver  A  has  the  domains  A,B,J,C.  Note  that  in  this  scheme  a 
viewserver’s  view  is  not  necessarily  connected.  For  example,  if  the  edge  ( C,J )  is  removed,  the  view 
of  viewserver  A  is  no  longer  connected.  (In  Section  3,  we  said  that  the  view  of  a  viewserver  should 
be  connected.  Here  we  have  relaxed  this  condition  to  simplify  testing.) 

The  fourth  viewserver  scheme  is  referred  to  as  locals-QT.  It  is  identical  to  locals  except  that 
during  the  query  protocol  all  the  viewservers  in  the  source  and  the  destination  addresses  are  queried. 

The  fifth  viewserver  scheme  is  referred  to  as  vertex-extension.  It  is  identical  to  base  except 
that  viewserver  precincts  are  extended  as  follows:  Let  P  denote  the  precinct  of  a  viewserver  in  the 
base  scheme.  For  each  domain  X  in  P,  if  there  is  an  edge  from  domain  X  to  domain  Y  and  Y 
is  not  in  P,  domain  Y  is  added  to  the  precinct;  among  y’s  edges,  only  the  ones  to  domains  in  P 
are  added  to  the  view.  In  the  example,  domains  M,  K,  F,  D  are  added  to  the  precinct  of  A ,  but 
outgoing  edges  of  these  domains  to  other  domains  are  not  included  (e.g.  ( F,G )  is  not  included). 
The  advantage  of  this  scheme  is  that  even  though  it  increases  the  precinct  size  by  a  factor  which 
is  potentially  greater  than  2,  it  increases  the  number  of  edges  stored  in  the  view  by  a  factor  less 
than  2.  (In  fact,  if  the  same  edge  cost  and  edge  policies  are  used  for  both  directions  of  domain- 


level  edges,  then  the  only  other  information  that  needs  to  be  stored  by  the  viewservers  is  the  policy 
constraints  of  the  newly  added  domains.) 

The  sixth  viewserver  scheme  is  referred  to  as  full-QT.  It  is  constructed  in  the  same  way  as 
vertex-extension  except  that  the  locals  scheme  is  used  instead  of  base  scheme  to  define  the  P  in 
the  construction.  In  full-QT,  during  the  query  protocol  all  the  viewservers  in  the  source  and  the 
destination  addresses  are  queried. 

In  all  the  above  viewserver  schemes,  we  have  used  the  same  hierarchy  for  both  domain  classes 
and  viewservers.  In  practice,  not  all  domains  need  to  have  a  viewserver,  and  a  viewserver  hierarchy 
different  from  the  domain  class  hierarchy  can  be  deployed.  However,  there  is  an  advantage  of 
having  a  viewserver  in  each  domain;  that  is,  source  nodes  do  not  require  fixed  domain-level  source 
routes  to  their  parent  viewservers  (in  the  view-query  protocol).  This  reduces  the  amount  of  hand 
configuration  required.  In  fact,  the  base  scheme  does  not  require  any  hand  configuration,  viewservers 
can  decide  their  precincts  from  the  intra-domain  routing  tables,  and  nodes  can  use  intra-domain 
routes  to  reach  parent  viewservers. 

Results  for  Internetwork  1 

The  parameters  of  the  first  internetwork  topology,  referred  to  as  Internetwork  1,  are  shown  in 
Table  1. 

Our  evaluation  measures  were  computed  for  a  (randomly  chosen  but  fixed)  set  of  1000  source- 
destination  pairs.  For  brevity,  we  use  spl  to  refer  to  the  shortest-path  length  (i.e.  the  length  of 
the  shortest  valid  path  in  the  internetwork  topology).  The  minimum  spl  of  these  pairs  was  2,  the 
maximum  spl  was  13,  and  the  average  spl  was  6.8.  Table  2  lists  for  each  viewserver  scheme  (1)  the 
minimum,  average  and  maximum  precinct  sizes,  (2)  the  mini-mum,  average  and  maximum  merged 
view  sizes,  and  (3)  the  minimum,  average  and  maximum  number  of  viewservers  queried. 

The  precinct  size  indicates  the  memory  requirement  at  a  viewserver.  More  precisely,  the  memory 
requirement  at  a  viewserver  is  0 (precinct  size  x  d)  where  d  is  the  average  number  of  neighbor 
domains  of  a  domain,  except  for  the  vertex-extension  and  full-QT  schemes.  In  these  schemes,  the 
memory  requirement  is  increased  by  a  factor  less  than  two.  Hence  the  vertex-extension  scheme  has 
the  same  order  of  viewserver  memory  requirement  as  the  base  scheme  and  the  full-QT  scheme  has 

27  Brandling  factor  is  4  for  a]]  region  classes. 
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Table  1:  Parameters  of  Internetwork  1. 


Scheme 

Precinct  Size 

Merged  View  Size 

No.  of  Viewservers  Queried 

base 

2  /  3.2  /  68 

7  /  71.03  /  101 

3  /  7.51  /  8 

base-QT 

2  /  3.2  /  68 

30  /  76.01  /  101 

8  /  8.00  /  8 

locals 

2  /  52.0  /  103 

3  /  95.40  /  143 

2  /  7.42  /  8 

locals- QT 

2  /  52.0  /  103 

43  /  101.86  /  143 

8  /  8.00  /  8 

vertex-extension 

3  /  19.2  /  796 

23  /  362.15  /  486 

3  /  7.51  /  8 

full-QT 

11  /  102.9  /  796 

228  /  396.80  /  519 

8  /  8.00  /  8 

Table  2:  Precinct  sizes,  merged  view  sizes,  and  number  of  viewservers  queried  fox  Internetwork  1. 


the  same  order  of  viewserver  memory  requirement  as  the  locals  scheme. 

The  merged  view  size  indicates  the  memory  requirement  at  a  source;  i.e.  the  memory  require¬ 
ment  at  a  source  is  (9 (merged  view  size  x  d)  except  for  the  vertex-extension  and  full-QT  schemes. 
Note  that  the  source  does  not  need  to  store  information  about  red  and  non-transit  domains.  The 
numbers  in  Table  2  take  advantage  of  this. 

The  number  of  viewservers  queried  indicates  the  communication  time  required  to  obtain  the 
merged  view  at  the  source.  Because  the  average  spl  is  6.8,  the  “real-time”  communication  time 
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required  to  obtain  the  merged  view  at  a  source  is  slightly  more  than  one  round-trip  time  between 
the  source  and  the  destination. 

As  is  apparent  from  Table  2,  using  a  QT  scheme  increases  the  merged  view  size  and  the  number 
of  viewservers  queried  only  by  about  5%.  Using  a  locals  scheme  increases  the  merged  view  size 
by  about  30%.  Using  the  vertex-extension  scheme  increases  the  merged  view  size  by  5  times  (note 
that  the  amount  of  actual  memory  needed  increases  only  by  a  factor  less  than  2).  The  number  of 
viewservers  queried  in  the  locals  scheme  is  less  than  the  number  of  viewservers  queried  in  the  base 
scheme.  This  is  because  the  viewservers  in  the  locals  scheme  have  bigger  precincts,  and  a  path  from 
the  source  to  the  destination  can  be  found  using  fewer  views. 

Table  3  shows  the  average  number  of  spl,  spl  +  1,  spl  +  2  length  paths  found  for  a  source- 
destination  pair  by  the  simple  approach  and  by  the  viewserver  schemes.  All  the  viewserver  schemes 
are  very  close  to  the  simple  approach.  The  vertex-extension  and  full-QT  schemes  are  especially  close 
(they  found  98%  of  all  paths).  Table  3  also  shows  the  number  of  pairs  for  which  the  viewserver 
schemes  did  not  find  a  path  (ranging  from  1.4%  to  5.9%  of  the  source-destination  pairs),  and 
the  number  of  pairs  for  which  the  viewserver  schemes  found  longer  paths.  For  these  pairs,  more 
viewserver  addresses  need  to  be  tried.  Note  that  the  locals  and  vertex-extension  schemes  decrease  the 
number  of  these  pairs  substantially  (adding  (JT  yields  further  improvement).  Our  policy  constraints 
are  source  and  destination  domain  independent.  Hence,  even  a  class  2  domain,  if  it  is  red,  can  not 
carry  traffic  to  a  class  3  domain  to  which  it  is  connected.  We  believe  that  these  figures  would 
improve  with  policies  that  are  dependent  on  source  and  destination  domains. 

As  is  apparent  from  Table  3  and  Table  2,  the  locals  scheme  does  not  find  many  more  extra 
paths  than  the  base  scheme  even  though  it  has  larger  precinct  and  merged  view  sizes.  Hence  it  is 
not  recommended.  The  vertex-extension  scheme  is  the  best,  but  even  base  is  adequate  since  it  finds 
many  paths. 

We  have  repeated  the  above  evaluations  for  two  other  internetworks  and  obtained  similar  con¬ 
clusions.  The  results  are  in  Appendix  B. 

6  Concluding  Remarks 

We  presented  hierarchical  inter-domain  routing  protocol  that  (1)  satisfies  policy  and  ToS  con¬ 
straints.  (2)  adapts  to  dynamic  topology  changes  including  failures  that  partition  domains,  and 
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Scheme 

Num' 

spl 

3er  of  pat 

spl  4- 1 

hs  found 

spl  +  2 

No.  of  pairs 

with  no  path 

No.  of  pairs 

with  longer  paths 

simple 

2.51 

18.48 

131.01 

N/A 

N/A 

base 

2.41 

15.84 

99.42 

59 

3  by  1.33  hops 

base-QT 

2.41 

15.86 

200.16 

54 

3  by  1.33  hops 

locals 

2.41 

16.17 

103.54 

29 

3  by  1  hop 

locals- QT 

2.41 

16.29 

105.02 

20 

3  by  1  hop 

vertex- extension 

2.51 

18.38 

128.19 

22 

0  by  0  hops 

full-QT 

2.50 

18.40 

128.90 

14 

0  by  0  hops 

Table  3:  Number  of  paths  found  for  Internetwork  1. 


(3)  scales  well  to  large  number  of  domains. 

Our  protocol  uses  partial  domain-level  views  to  achieve  scaling  in  space  requirement.  It  floods 
domain-level  topological  changes  over  a  flood  area  to  achieve  scaling  in  communication  requirement. 

It  does  not  abstract  domains  into  superdomains;  hence  it  does  not  lose  any  domain-level  detail 
in  ToS  and  policy  information.  It  merges  a  sequence  of  partial  views  to  obtain  domain-level  source 
routes  between  nodes  which  are  far  away.  The  number  of  views  that  need  to  be  merged  is  bounded 
by  twice  the  number  of  levels  in  the  hierarchy. 

To  evaluate  and  compare  inter-domain  routing  protocols  against  each  other  and  against  sim¬ 
ple  approach,  we  presented  a  model  in  which  one  can  define  internetwork  topologies,  policy /ToS 
constraints,  inter-domain  routing  hierarchies,  and  evaluation  measures.  We  applied  this  model  to 
evaluate  our  viewserver  hierarchy  and  compared  it  to  the  simple  approach.  Our  results  indicate 
that  viewserver  hierarchy  finds  many  short  valid  paths  and  reduces  the  amount  of  memory  require¬ 
ment  by  two  order  of  magnitude. 

Our  protocol  recovers  from  fail-stop  failures  of  viewservers  and  gateways.  "When  a  viewserver 
fails,  an  address  which  includes  the  viewservers  id  becomes  useless.  This  deficiency  can  be  overcome 
by  Teplicating  each  viewserver  at  different  nodes  of  the  domain  (in  this  case  a  viewserver  fails  only 
if  all  nodes  implementing  it  fail}.  This  replication  scheme  requires  viewserver  ids  to  be  independent 
of  node  ids,  which  can  be  easily  accomplished28. 

2e  For  example,  If  node-ids  of  nodes  implementing  a  viewserver  share  a  prefix,  this  prefix  can  be  used  as  the 
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The  only  drawback  of  our  protocol  is  that  to  obtain  a  domain-level  source  route,  views  are 
merged  at  (or  prior  to)  the  connection  (or  flow)  setup,  thereby  increasing  the  setup  time.  This 
drawback  is  not  unique  to  our  scheme  [8,  13,  6,  10]- 

There  are  several  ways  to  reduce  the  setup  overhead.  First,  domain-level  source  routes  to  fre¬ 
quently  used  destinations  can  be  cached.  The  cacheing  period  would  depend  on  the  ToS  require¬ 
ment  of  the  applications  and  the  frequency  of  domain-level  topology  changes.  For  example,  the 
period  can  be  long  for  electronic  mail  since  it  does  not  require  shortest  paths. 

Second,  views  of  frequently  queried  viewservers  can  be  replicated  at  “mirror”  viewservers  in  the 
source  domain.  A  viewserver  would  periodically  update  the  views  of  its  mirror  viewservers. 

Third,  connection  setup  also  involves  traversing  the  name  server  hierarchy  (to  obtain  destination 
addresses  from  its  names).  By  integrating  the  name  server  hierarchy  with  the  viewserver  hierarchy, 
we  may  be  able  to  do  both  operations  simultaneously.  This  requires  further  investigation. 
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A  View-Update  Protocol  Event  Specifications 

The  events  of  gateway  g  are  specified  in  Figure  7.  When  a  gateway  g  recovers,  Celllds  is  set  to 
nodeid(g).  Thus,  when  g  next  executes  Updates,  it  sends  either  an  UpdateCell  or  a  DeleteCell 
message  to  viewservers,  depending  on  whether  it  is  no  longer  the  minimum  id  gateway  in  its  cell29. 

The  events  of  a  viewserver  x  are  specified  in  Figure  5.  Note  that  when  x  adds  an  entry  to 
DView-  (upon  receiving  a  UpdateCell  message),  it  selectively  chooses  subset  of  neighbors  from 
the  cost  set  in  the  packet  to  include  only  the  neighbor  domains  which  axe  in  SView~.  When 
a  viewserver  x  recovers,  DViewx  is  set  to  {}.  Its  view  becomes  up-to-date  as  it  receives  new 
information  from  reporting  gateways  (and  remove  false  information  with  the  time- to- die  period). 

25  Sending  a  DeleteCell  message  is  essential.  Because  prior  to  the  failure,  g  may  have  been  the  smallest  id 
gateway  in  its  cell.  Hence,  some  viewservei’s  may  still  contain  an  entry  for  its  old  domain  cell. 


308 


Updates  {Executed  periodically  and  also  optionally  upon  a  change  in  lntraDomainRTs  } 
{Determines  the  id  of  p’s  cell  and  initiates  UpdateCell  and  DeleteCell  messages  if  needed.} 
OldCellld  =  Cellld,] 

Celllds  :=  compute  cell  id  using  Local  Gateways/  and  lntraDomainRTs', 
if  nodeid(g)  =  Cellld,  then 

ncostset  :=  compute  costs  for  each  neighbor  domain  cell  using  lntraDomainRTs  ] 
floods ((UpdateCell,  domainid(g),  Celllds,  Clocks,  FloodAreas, ncostset))] 
endif 

if  nodetd(g)  =  OldCellld  ^  Cellld}  then 

/7oody  ((DeleteCell,  domainid(g),  nodeid(g),  Clock},  FloodAreas )); 
endif 

Receives (packet)  {either  an  UpdateCell  or  a  DeleteCell  packet} 

floodj  (packet) 

where  procedure  flood,  (pack cl) 

if  domainid(g)  £  packet. floodarea  then 

{remove  domain  of  g  from  the  flood  area  to  avoid  infinite  exchange  of  the  same  message.} 
packet. floodarea  :=  packet. floodarea  —  {domainid(g)}] 
for  all  h  €  LocalGateways s  U  LocalV iewservers ,  do 
Send(paciet)  to  h  using  (); 
endif 

for  all  h  €  Adj ForeignGateways s  A  domainid(h)  £  packet. floodarea  do 
Sen  d(packet)  to  h] 

Gateway  Failure  Model:  A  gateway  can  undergo  failures  and  recoveries  at  anytime.  We  assume  failures 
are  fail-stop  (i.e.  a  failed  gateway  does  not  send  erroneous  messages).  When  a  gateway  g  recovers,  Cellld g 
is  set  to  nodeid(g). 


Figure  7:  View-update  protocol:  Events  of  a  gateway  p. 


B  Results  for  Other  Internetworks 


Results  for  Internetwork  2 

The  parameters  of  the  second  internetwork  topology,  referred  to  as  Internetwork  2,  are  the  same 
as  the  parameters  of  Internetwork  1  (a  different  seed  is  used  for  the  random  number  generation). 

Our  evaluation  measures  were  computed  for  -a  set  of  1000  source-destination  pairs.  The  mini¬ 
mum  spl  of  these  pairs  was  2,  the  maximum  spl  was  13,  and  the  average  spl  was  7.2. 

Table  4  and  Table  5  shows  the  results.  Similar  conclusions  to  Internetwork  1  hold  for  Internet¬ 
work  2.  In  Table  5,  the  reason  that  local  and  QT  schemes  have  more  pairs  with  longer  paths  than 
the  base  scheme  is  that  these  schemes  found  some  paths  (which  are  not  shortest)  for  some  pairs  for 
which  the  base  scheme  did  not  find  any  path. 
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Reccivts  (IJpdateCell,  did ,  cid,  ts,  FloodArea,  ncset ) 
if  did  £  Precincts  then 

if  3(did:rid,  timestamp,  expirytime,  deleted ,  ncostset)  £  DViewx  A 
ts  >  timestamp  then  {received  is  more  recent;  delete  the  old  one} 

delete  (did:dd,  timestamp,  expirytime,  deleted,  ncostset)  from  DViewx\ 
endif 

if -3 (did:dd,  timestamp,  expirytime,  deleted,  ncostset)  £  DV iewx  then 
Choose  ncostset  from  ncset  using  SViewx; 

insert  { did:dd ,  ts,  Clockx  +  TimeToDiex,  ialse,  ncostset)  to  DViewx\ 
endif 
endif 

Receivex  (DeleteCell,  did,  cid,  ts,  floodarea ) 
if  did  £  Precincts  then 

if  B(did:dd,  timestamp,  expirytime,  deleted,  ncostset)  £  DViewx  A 
ts  >  timestamp  then  {received  is  more  recent;  delete  the  old  one} 

delete  { did:dd ,  timestamp,  expirytime,  deleted,  ncostset)  from  DViewx] 
endif 

if  — >3 (didrdd,  timestamp,  expirytime,  deleted,  ncostset)  £  DV iewx  then 
insert  ( did:dd ,  ts,  Clockx-i- TimeToDiex,  tme,{}}  to  DViewx; 
endif 
endif 

Delete.  {Executed  periodically  to  delete  entries  older  than  the  time-to-die  period} 

for  all  (A:g,  tstamp,  expirytime,  deleted,  ncset)  £  DViewx  A  expirytime  <  Clockx  do 
delete  (A:g,  tstamp,  expirytime,  deleted,  ncset)  from  DViewx', 

Viewserver  Failure  Model:  A  viewserver  can  undergo  failures  and  recoveries  at  anytime.  We  assume 
failures  are  fail-stop.  When  a  viewserver  x  recovers,  DViewz  is  set  to  {}. 


Figure  8:  View  update  events  of  a  viewserver  x. 


Scheme 

— 

Precinct  Size 

Merged  View  Size 

No.  of  Viewservers  Queried 

base 

2  /  3.2  /  76 

.  4  /  66.62  /  96 

3  /  7.55  /  8 

base-QT 

2  /  3.2  /  76 

29  /  72.76  /  96 

8  /  8.00  /  8 

locals 

3  /  69.8  /  149 

4  /  101.32  /  148 

2  /  7.36  /  8 

locals- QT 

3  /  69.8  /  149 

35  /  110.32  /  152 

8  /  8.00  /  8 

vertex-extension 

3  /  19.47  /  817 

15  /  339.60  /  469 

3  /  7.55  /  8 

full-QT 

11  /  135.2  /  817 

186  /  402.51  /  503 

8  /  8.00  /  8 

Table  4:  Precinct  sizes,  merged  view  sizes,  and  no  of  viewservers  queried  for  internetwork  2. 
Results  for  Internetwork  3 

The  parameters  of  the  third  internetwork  topology,  referred  to  as  Internetwork  3,  are  shown  in 
Table  6.  Internetwork  3  is  more  connected,  more  class  0,  1  and  2  domains  are  green,  and  more 
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Number  of  paths  found 

No.  of  pairs 

- K - -  1 

No.  or  pairs 

Scheme 

spl 

spl  +  1 

spl  -f  2 

with  no  path 

with  longer  paths 

simple 

2.21 

13.22 

74.30 

N/A 

N/A 

base 

1.98 

8.20 

34.40 

123 

13  by  1.08  hops 

■base-QT 

1.98 

8.36 

35.62 

110 

15  by  1.13  hops 

locals 

2.08 

9.18 

40.50 

97 

23  by  1.39  hops 

locals- QT 

2.08 

9.38 

42.08 

67 

23  by  1.30  hops 

vertex-extension 

2.18 

12.57 

64.98 

19 

6  by  1  hop 

full-QT 

2.19 

12.85 

67.37 

4 

4  by  1  hop 

Table  5:  Number  of  paths  found  for  Internetwork  2. 


class  3  domains  are  red.  Hence,  we  expect  more  valid  paths  between  source  and  destination  pairs. 

Our  evaluation  measures  were  computed  for  a  set  of  1000  source-destination  pairs.  The  mini¬ 
mum  spl  of  these  pairs  was  2,  the  maximum  spl  was  10,  and  the  average  spl  was  5.93. 


Class  i 

No.  of  Domains 

No.  of  Regions30 

%  of  Green  Domains 

B 

ad  j 

Far 

0 

10 

4 

0.85 

0 

8 

7 

0 

1 

100 

16 

0.80 

0 

190 

20 

0 

1 

50 

20 

0 

2 

1000 

64 

0.75 

500 

50 

0 

1 

1200 

100 

0 

2 

200 

40 

0 

3 

10000 

256 

0.10 

0 

300 

50 

0 

1 

250 

100 

0 

2 

10250 

150 

50 

3 

200 

150 

100 

Table  6:  Parameters  of  Internetwork  3. 


30 Brandling  factor  is  4  for  all  domain  classes. 
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Table  7  and  Table  8  shows  the  results.  Similar  conclusions  to  Internetwork  1  and  2  hold  for 
Internetwork  3. 


Scheme 

Precinct  Size 

Merged  View  Size 

No.  of  Viewservers  Queried 

base 

2/3.5  /  171 

5  /  134.41  /  206 

3  /  7.26  /  8 

base-QT 

2  /  3.5  /  171 

55  /  154.51  /  206 

8  /  8.00  /  8 

locals 

3  /  70.17  /  171 

4  /  164.16  /  257 

2  /  7.09  /  8 

locals- QT 

3  /  70.17  /  171 

57  /  191.06  /  258 

8  /  8.00  /  8 

vertex-extension 

5  /  34.17  /  1986 

18  /  601.56  /  695 

3  /  7.26  /  8 

full-QT 

14  /  155.5  /  1986 

503  /  655.79  /  743 

8  /  8.00  /  8 

Table  7:  Precinct  sizes,  merged  view  sizes,  and  no  of  viewservers  queried  for  Internetwork  3. 


Scheme 

1 

jer  of  pat 

spl  +  1 

hs  found 

spl  +  2 

No.  of  pairs 

with  no  path 

No.  of  pairs 

with  longer  paths 

simple 

3.34 

37.55 

368.97 

N/A 

N/A 

base 

2.83 

24.25 

178.08 

17 

11  by  1.09  hops 

base-QT 

2.87 

25.53 

193.41 

12 

8  by  1.12  hops 

locals 

196.33 

21 

8  by  1  hop 

locals- QT 

2.97 

27.59 

219.63 

2 

6  by  1  hop 

vertex-extension 

3.32 

* 

35.73 

332.54 

5 

1  by  1  hop 

fvll-QT 

3.33 

36.47 

346.44 

0 

0  by  0  hops 

Table  8:  Number  of  paths  found  for  Internetwork  3. 


Figure  9  through  Figure  11  show  the  number  of  spl,  spl  ■+■  1  and  spl  +  2  length  paths  found  by 
the  schemes  as  a  function  of  spl  (we  only  show  results  for  spl  values  for  which  more  than  10  pairs 
were  found).  We  do  not  include  base-QT,  locals  and  locals- QT  schemes  since  they  are  very  close 
to  base  scheme.  As  expected,  as  spl  increases,  the  number  of  paths  for  a  source-destination  pair 
increases,  and  the  gap  between  the  simple  scheme  and  the  viewserver  schemes  increases. 


spl+2  lenglh  paths  lound  sp!>  1  length  pntlis  (ound  spl  length  paths  found 


3  4  5  6  7  8  9 

spi  length 

Figure  9:  Number  of  spl  length  paths  found  for  Internetwork  3. 


spl  length 


Figure  10:  Number  of  spl  -r  1  length  paths  found  for  Internetwork  3. 


3  4  5.6  7  8  9 

spl  length 

Figure  11:  Number  of  spl  +  2  length  paths  found  for  Internetwork  3. 
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Abstract 
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u 

T2 

T3 

T4 

Static  time  driven  scheduling  has  been  advocated  for  use  in 
Hard  Real-Time  systems  and  is  particularly  appropriate  for 
many  embedded  systems.  The  approaches  taken  for  static 
scheduling  often  use  search  techniques  and  may  reduce  the 
search  by  using  heuristics.  In  this  paper  we  present  a  tech¬ 
nique  for  analyzing  the  temporal  relations  among  the  tasks, 
based  on  non-preemptive  schedulability.  The  relationships 
can  be  used  effectively  to  reduce  the  average  complexity  of 
scheduling  these  tasks.  They  also  serve  as  a  basis  for  selective 
preemption  policies  for  scheduling  by  providing  an  early  test 
for  infeasibility.  We  present  examples  and  simulation  results 
to  confirm  the  usefulness  of  temporal  analysis  as  a  phase  prior 
to  scheduling. 

1  Introduction 

Many  safety  critical  real-time  applications  like  process  con¬ 
trol,  embedded  tactical  systems  for  military  applications,  air- 
traffic  control,  robotics  etc.  have  stringent  timing  constraints 
imposed  on  their  computations  due  to  the  characteristics  of 
the  physical  system.  A  failure  to  observe  the  timing  con¬ 
straints  can  result  in  intolerable  system  degradation  and  in 
some  cases  it  may  have  catastrophic  consequences. 

Scheduling  is  the  primary  means  of  ensuring  the  satisfaction 
of  timing  constraints  for  such  systems[l].  As  a  result,  signif¬ 
icant  effort  has  been  invested  in  research  on  hard  real  time 
scheduling  [2,  3,  4].  In  this  paper  we  discuss  a  scheduling 
technique  for  static  scheduling  to  guarantee  timely  execution 
of  time  critical  tasks. 

The  time  driven  scheduling  model  is  being  used  by  many 
experimental  systems,  including  MARS[5],  MARUTI[6]  and 
Spring  Kernel[7],  The  static  time  driven  scheduling  technique 
involves  constructing  a  schedule  offline,  which  may  be  repre¬ 
sented  as  a  Gantt  chart[8]  or  calendar[6]  (Figure  1).  Tasks 
are  invoked  at  run-time  whenever  they  are  scheduled  to  exe¬ 
cute.  Such  a  scheduling  model  is  particularly  appropriate  for 
many  embedded  systems.  Recent  effort  in  this  direction  has 
shown  the  viability  of  such  an  approach  for  practical  real-time 
applications^]. 

’This  research  was  supported  in  part  by  ONR  and  DARPA  under 
contract  N00014-91-C-0195. 


Figure  1:  Gantt  Chart  or  Calendar 

The  intractability  of  most  scheduling  problems  has  led  to 
app roadies  based  on  search  techniques  for  scheduling  of  real¬ 
time  tasks.  The  feasibility  of  a  task  set  is  determined  through 
construction  of  a  schedule;  failure  to  construct  a  schedule 
denotes  infeasibility.  Heuristics  are  often  used  as  a  means 
of  controlling  the  complexity  of  scheduling.  In  many  cases, 
heuristics  perform  well  enough  to  result  in  an  acceptable  so¬ 
lution. 

There  has  been  little  emphasis  on  the  use  of  analytic  tech- 
niques  to  assist  in  time  driven  scheduling.  Decomposition 
scheduling[10]  based  on  dominance  properties  of  sequences[ll] 
uses  analytic  techniques  to  decompose  a  set  of  tasks  into  a  se¬ 
quence  of  subsets.  Significant  reduction  in  average  complexity 
can  be  achieved  if  the  set  of  tasks  can  be  decomposed  into  a 
large  number  of  subsets,  each  having  a  small  number  of  tasks. 

In  this  paper,  we  present  an  analysis  technique  for  time 
driven  scheduling  based  on  the  timing  requirements  of  tasks. 
The  analysis  results  in  the  establishment  of  a  set  of  temporal 
relations  between  pairs  of  tasks  based  on  a  non-preemptive 
scheduling  model.  These  relations  can  be  used  by  scheduling 
algorithms  to  reduce  the  complexity  of  scheduling  in  the  av¬ 
erage  case,  and  as  an  early  test  for  infeasibility.  As  a  test  for 
infeasibility,  it  provides  a  good  basis  for  policies  using  selec¬ 
tive  preemption  to  enhance  feasibility.  When  infeasibility  is 
not  detected,  the  temporal  relations  may  be  used  by  a  search 
algorithm  to  effectively  prune  large  portions  of  search  space, 
thereby  controlling  the  cost  of  scheduling. 

2  Time  Driven  Scheduling 

The  time  driven  scheduling  approach  constructs  a  calendar 
for  the  set  of  tasks  in  the  system.  The  tasks  may  be  sched¬ 
uled  preemptively  or  non-preemptively.  The  non-preemptive 
scheduling  problem  for  a  uniprocessor  is  known  to  be  NP- 
Complete[12],  When  the  tasks  are  mutually  independent,  and 
can  be  preempted  at  any  time,  it  is  known  that  the  earliest, 
deadline  first  policy  is  optimal[13]  and  obviates  the  need  for 
non-preemptive  scheduling.  However,  when  tasks  synchronize 
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using  critical  sections,  the  preemptive  scheduling  problem  is 
also  known  to  be  intractable(NP-Hard)[14]. 

In  general,  when  the  overhead  of  preemption  is  negligible, 
the  non-preemptive  solutions  form  a  subset  of  preemptive 
solutions[8].  However,  when  tasks  may  interact  with  each 
other,  the  non-preemptive  models  are  simpler,  easier  to  im¬ 
plement  and  closer  to  reality [15].  They  are  also  necessary  for 
certain  scheduling  domains  like  I/O  scheduling  and  provide  a 
basis  for  selective  preemption  policies. 

2.1  Task  Model 

We  consider  a  set  of  n  tasks  T  =  {r<  :  i  —  1,2, ...,n}  to 
be  scheduled  for  execution  on  a  single  processor.  Each  task 
Tj ,  abbreviated  as  i,  is  a  3-tuple  [r<,  c<,  d,]  denoting  the  ready 
time,  computation  time  and  deadline  respectively.  The  time 
interval  [r» ,  d^]  is  called  the  timing  window  Wi  of  task  r,-,  and 
indicates  the  time  interval  during  which  the  task  can  execute. 
The  computation  time  of  each  task  is  less  than  the  window 
length  All  tasks  are  assumed  to  be  independent  for 

simplicity  of  exposition  even  though  such  a  requirement  is 
not  necessary  for  the  analysis. 

In  a  hard  real-time  system,  processes  may  be  periodic  or 
sporadic  [14].  Such  a  set  of  processes  may  be  mapped  to  our 
scheduling  model  by  techniques  identified  in  [1,  14,  16]  and 
constructing  a  schedule  for  the  least  common  multiple  of  the 
periods  of  the  tasks. 

2.2  Non-Preemptive  Scheduling  Model 

A  non-preemptive  schedule  is  the  mapping  of  each  task  r<  in  T 
to  a  start  time  s,- .  The  task  is  then  scheduled  to  run  without 
preemption  in  the  time  interval  [s,-,/,-],  with  its  finish  time 
being  /,  =  st-  +  c,-.  A  feasible  schedule  is  a  schedule  in  which 
the  following  conditions  are  satisfied  for  each  task  r,-: 

ri  <  Si  (1) 

/.  <  di  (2) 

It  is  useful  to  consider  a  non-preemptive  schedule  as  an 
ordered  sequence  of  the  set  of  tasks.  To  get  a  maximally 
packed  schedule  from  a  sequence  [t\,  t2,  . . . ,  r„],  we  can  re¬ 
cursively  derive  the  start  time  s,-  and  finish  time  /,•  of  the 
tasks  as  follows: 

Si  =  max(r, •,/,_!)  (3) 

fi  =  Si  +  a  (4) 

with 

si  =  rj 

The  scheduling  problem  can  thus  be  considered  as  a  search 
over  the  permutation  space.  A  permutation  (sequence)  is  fea¬ 
sible  if  the  corresponding  schedule  is  feasible.  Notice  that 
for  any  permutation  schedule  derived  as  above,  equation  1  is 
implied  by  (3)  and  we  only  need  to  verify  the  deadline  con¬ 
straints  for  the  tasks. 

1|wil  =  d,  -  r,- 
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i  i 
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Figure  2:  Infeasibility  of  task  A  executing  before  task  B 

3  Temporal  Analysis 

Temporal  analysis  uses  pairwise  schedulability  analysis  of 
tasks  to  generate  a  set  of  relations  to  eliminate  sequences 
which  cannot  lead  to  feasible  solutions.  In  this  section  we  de¬ 
fine  the  temporal  relations  and  show  how  they  may  be  derived 
from  the  timing  constraints  of  tasks. 

3.1  Definitions  of  Temporal  Relations 

Consider  two  tasks  ry  and  Tj.  We  wish  to  find  out  what  we 
can  say  about  the  relative  ordering  of  these  tasks,  given  their 
timing  constraints.  A  set  of  relations  are  identified  below 
which  identify  the  different  possibilities. 

Precedence  Relation:  A  precedence  relation  denoted  as 
Ti  — ►  Tj,  implies  that  in  any  feasible  schedule  r,-  must 
execute  before  r;- . 

Infeasible  Relation:  An  infeasible  relation  denoted  by 
n  0  Tj  implies  that  in  any  feasible  schedule,  t;  and  Tj 
cannot  run  in  a  sequential  order. 

Concurrent  Relation:  r,  ||  if  there  is  no  precedence  or 
infeasible  relation  between  them.  A  concurrent  relation 
indicates  that  a  feasible  schedule  may  exist  with  any  or¬ 
der  of  the  tasks  r,-  and  Tj.  It  does  not,  however,  indicate 
the  existence  of  a  feasible  schedule. 

For  each  task  t;  let  us  define  two  terms  Ci  and  denoting 
the  earliest  finish  time  and  the  latest  start  time  as: 

Si  =  Ti  +  a  (5) 

h  =  d,  -  Ci  (6) 


A  preliminary  set  of  relations  can  be  established  using  the 


following  rules,  for  every  pair  of  tasks  r,-  and  Tj . 

(ei  —  h )  A  (/,'  <  Cj)  =>  ~i  - -  Tj 

(') 

(c,  >  Ij)  A  (/j  >  Cj ) 

=>  Tj  - *  Ti 

(8) 

(e;  <lj)  A  (/,•  >  Cj) 

=>  Ti  ||  Tj 

(9) 

(e,-  >  lj )  A  (/,•  <  e j ) 

=>  Ti  0  Tj 

(10) 

The  basic  idea  is  that  if  the  earliest  finish  time  of  a  task  A  is 
greater  than  the  latest  start  time  of  a  task  B,  then  a  feasible 
schedule  cannot  be  found  in  which  A  is  scheduled  before  B 
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Figure  3:  Window  Modification  (A  — *  B) 
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Figure  5:  Precedence  Graph  for  example  of  Figure  4 

(Figure  2).  Thus,  for  instance  the  first  part  of  condition  for 
rule  10  says  that  r,  cannot  precede  t<  ,  and  the  second  part 
says  that  tj  cannot  precede  r,-,  establishing  the  infeasible  re¬ 
lation. 

3.2  Window  Modification 

Consider  two  tasks  r0  and  rj,  and  a  precedence  relation 
ra  — •  rj  between  them.  As  this  indicates  that  in  any  fea¬ 
sible  schedule  r0  must  precede  n,  we  can  update  the  timing 
windows,  as  follows  (Figure  3): 

d'a  =  min(da,lb)  (11) 

r'b  =  max(rb,ea)  (12) 

The  window  modification  does  not  alter  the  scheduling 
problem  in  the  sense  that  every  feasible  sequence  with  the 
original  timing  constraints  is  a  feasible  sequence  with  the 
modified  timing  constraints  and  vice-versa.  Further,  the 
schedules  for  feasible  sequences  are  identical  in  both  cases. 
A  task’s  window  may  shrink  because  of  window  modification. 
This  may  lead  to  a  change  in  the  relation  of  the  modified  task 
with  other  tasks.  The  procedure  may  be  applied  iteratively 
till  no  further  changes  can  be  made  or  an  infeasible  relation 
is  detected. 

3.3  Examples 

(a)  Consider  a  set  of  five  tasks  as  shown  in  Figure  4.  The 
temporal  analysis  leads  us  to  the  following  set  of  prece¬ 
dence  relations,  sans  the  redundant  ones: 

{r2  — *  r3,  T\  — *  r5,  r3  — »  r5,  r4  — *  r5} 

The  set  of  precedence  relations  may  be  represented  as 
a  precedence  graph  (Figure  5)  and  impose  a  partial  or¬ 
der  on  the  task  set.  Only  sequences  which  are  consistent 


with  this  partial  order  need  to  be  considered  for  schedul¬ 
ing.  For  5  tasks,  the  total  number  of  permutations  is 
120(=  5!).  The  number  of  total  orders  consistent  with 
the  partial  order  of  Figure  5  is  12,  which  is  a  drastic  re¬ 
duction  in  the  number  of  sequences  that  need  to  be  con¬ 
sidered  for  scheduling.  The  modified  task  set  is  shown  in 
Figure  4(b),  with  the  modified  values  in  bold. 

(b)  As  another  example,  consider  the  set  of  4  tasks  as  shown 
in  Figure  6.  The  task  set  in  different  stages  of  temporal 
analysis  is  shown,  with  the  new  temporal  relations2  at 
each  stage.  This  example  shows  how  successive  refine¬ 
ment  of  temporal  relations  can  lead  to  detecting  infeasi¬ 
bility. 

3.4  Complexity  of  Temporal  Analysis 

It  is  easy  to  see  that  the  initial  set  of  relations  can  be  estab¬ 
lished  in  0(n2)  time.  Further,  each  phase  of  refinement  also 
takes  no  more  than  0(n2).  An  upper  bound  for  the  num¬ 
ber  of  phases  is  n.  Therefore,  the  worst  case  complexity  of 
temporal  analysis  is  0(n3).  In  practice,  however,  the  cost  of 
temporal  analysis  can  be  significantly  less  since  concurrent  re¬ 
lations  and  relations  between  non-overlapping  tasks  need  not 
be  generated  explicitly.  Furthermore,  the  number  of  phases 
required  to  stabilize  window  modification  can  be  reduced  if 
the  release  times  are  modified  in  the  topological  sort  order 
of  the  precedence  graph  and  deadlines  are  modified  in  the 
reverse  topological  sort  order. 

In  any  case,  the  cost  of  temporal  analysis  for  static  schedul¬ 
ing  is  not  significant  when  used  in  conjunction  with  an  expo¬ 
nential  time  scheduling  algorithm.  In  section  5,  we  show  em¬ 
pirically  that  the  cost  of  temporal  analysis  is  not  a  significant 
factor  for  static  scheduling. 

4  Non  Preemptive  Scheduling  using 
Temporal  Analysis 

The  relations  established  through  temporal  analysis  serve  as  a 
basis  for  scheduling  of  the  tasks.  Temporal  analysis  may  thus 
be  perceived  as  a  pre-processing  stage  for  scheduling.  The 
result  of  this  pre-processing  stage  is  one  of  the  following: 

1.  The  task  set  was  detected  to  be  infeasible,  due  to  the 
existence  of  one  or  more  infeasible  relations. 

2.  A  set  of  precedence  relations  were  established  generat¬ 
ing  a  precedence  graph.  The  precedence  graph  imposes 
a  partial  order  on  the  set  of  tasks.  It  serves  as  an  in¬ 
put  to  the  scheduler  which  may  exploit  the  partial  order 
generated  to  prune  the  search  space. 

4.1  Detecting  Infeasibility 

Whenever,  an  infeasible  relation  exists  between  two  tasks,  it 
is  known  that  no  ordering  of  the  two  tasks  is  feasible.  Thus, 

2  Concun-ent  Relations  are  not  shown. 
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Figure  4:  Window  Modification:  (a)  Original  Task  Set  (b)  Task  Set  after  Temporal  Analysis 
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Figure  6:  Example  for  Determining  Infeasibility  with  Temporal  Analysis 


the  detection  of  an  infeasible  relation  at  any  stage  in  tem¬ 
poral  analysis  indicates  that  the  task  set  is  infeasible.  Even 
though  only  pairwise  schedulability  analysis  is  used  for  estab¬ 
lishing  relations,  successive  refinement  of  relations  results  in 
a  possible  percolation  of  this  effect  to  other  tasks  too.  This 
effect  is  exemplified  in  the  example  of  Figure  6,  where  sev¬ 
eral  iterations  lead  to  a  infeasible  relation.  It  must  be  noted 
that  whenever  infeasibility  is  detected,  the  resulting  task  set 
and  their  relations  also  provide  a  good  feedback  as  to  what 
caused  it.  The  feedback  information  may  be  used  to  allo¬ 
cate  more  resources,  change  resource  allocation  or  allow  for 
selective  preemption  as  the  case  may  be. 

4.2  Search  Technique  for  Scheduling 

The  intractability  of  non-preemptive  scheduling  has  led  to 
implicit  enumeration  techniques  based  on  branch  and  bound 
search  methods.  The  search  space  is  the  set  of  all  possible 
permutation  sequences.  One  way  of  enumerating  schedules  is 
to  generate  an  initial  schedule  and  then  successively  refine  it 
using  heuristics  to  generate  “better”  schedules,  until  a  feasible 
schedule  is  obtained  [3,  17,  16]. 

In  this  paper,  we  concentrate  on  another  enumeration 
method  which  constructs  a  schedule  in  an  incremental  man¬ 
ner.  Variants  of  this  method  have  been  used  in  [4,  18,  19, 
20,  21]  The  search  space  is  represented  as  a  search  tree.  The 


root  (level  0)  of  the  tree  is  an  empty  schedule.  The  nodes 
of  the  tree  represent  partial  schedules.  A  node  at  level  k 
gives  a  partial  schedule  with  k  tasks.  The  leaves  are  complete 
schedules.  The  successors  of  an  intermediate  node  are  imme¬ 
diate  extensions  of  the  partial  schedule  corresponding  to  that 
node.  From  a  node  at  level  k,  there  are  at  most  n  —  k  branches 
with  each  branch  corresponding  to  an  extension  of  the  partial 
schedule  by  appending  one  more  task  to  the  schedule.  Search 
is  done  in  a  branch  and  bound  manner,  wherein  parts  of  the 
search  tree  are  pruned  when  it  is  determined  that  no  feasible 
schedule  can  arise  from  them.  For  each  node  being  expanded, 
the  following  conditions  must  hold. 

1.  All  immediate  extensions  of  the  node  must  be  feasible 
[4,  18]. 

2.  The  remaining  computational  demand  must  not  exceed 
the  difference  between  the  largest  deadline  of  remaining 
tasks  and  current  scheduling  time  [4], 

If  any  condition  is  violated  then  no  feasible  schedule  can 
be  generated  in  the  subtree  originating  from  this  node.  No 
search  is  conducted  on  the  subtree  rooted  at  such  a  node. 

4.2.1  Heuristically  Guided  Scheduling 

Heuristics  are  commonly  used  to  guide  search  in  many  combi¬ 
natorial  searching  problems.  For  non-preemptive  scheduling 
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heuristics  may  be  used  to  guide  search  along  paths  which  are 
more  likely  to  lead  feasible  schedules.  Search  is  done  in  a 
depth  first  manner  until  either  a  complete  feasible  schedule  is 
found,  in  which  case  the  search  terminates,  or  it  is  determined 
that  no  possible  extensions  of  the  current  node  can  lead  to  a 
feasible  schedule.  Heuristics  are  used  to  determine  which  of 
the  many  children  of  a  node  should  be  searched  next.  Back¬ 
tracking  takes  place  when  no  further  extensions  of  a  node  can 
be  made.  We  evaluate  temporal  analysis  using  such  a  heuris¬ 
tic  search  for  scheduling. 

5  Empirical  Evaluation  of  Temporal 
Analysis 

In  the  previous  sections,  we  have  shown  how  temporal  anal¬ 
ysis  may  be  used  to  restrict  the  search  space  for  scheduling. 
Clearly,  the  existence  of  even  a  few  precedence  relations  re¬ 
sults  in  a  drastic  reduction  of  search  space3.  However,  the 
usefulness  of  the  scheme  is  not  obvious  since  we  are  only  in¬ 
terested  in  feasible  schedules,  hence  a  large  part  of  the  search 
space  may  never  need  to  be  examined.  We  have  conducted 
various  simulations  to  verify  that  indeed  temporal  analysis 
results  in  improved  performance  for  scheduling.  For  reasons 
of  space,  we  mention  only  a  few  significant  results. 

We  used  a  heuristic  search  technique  for  scheduling  as  de¬ 
scribed  in  section  4.2.  The  heuristic  used  for  our  simulation 
study  was  a  two  level  heuristic.  The  primary  heuristic  was 
earliest  start  ri'me(EST). 

ESTi  =  max{ri,fk) 

where  k  is  the  last  task  in  the  partial  schedule  at  that  node. 

In  the  case  of  a  conflict,  the  secondary  heuristic  earliest 
deadline  was  used.  Further  conflicts  were  resolved  arbitrarily. 
The  heuristic  has  a  natural  intuitive  appeal  and  is  known  to 
produce  good  results  among  linear  heuristics[22]. 

For  each  set  of  parameters,  we  generated  200  “feasible” 
task  sets  with  100  tasks  each.  The  task  sets  were  gener¬ 
ated  with  100%  utilization  as  this  presents  the  most  difficulty 
for  scheduling.  The  computation  times  were  generated  using 
uniform  distributions  and  laxities  using  normal  distribution. 
We  compared  the  success  ■percentage  (i.e.  percentage  of  suc¬ 
cessfully  scheduled  task  sets)  of  scheduling  with  and  without 
temporal  analysis  as  a  pre-processing  stage.  The  success  per¬ 
centage  (SP)  is  plotted  against  “cut-off-time”,  indicating  the 
maximum  time  allowed  to  the  scheduling  algorithm  to  suc¬ 
cessfully  generate  a  schedule. 

Our  simulation  results  show  that  temporal  analysis  is  not 
needed  for  scheduling  when  both  the  mean  and  the  variation 
in  laxities  is  low  since  the  simple  heuristics  were  able  to  sched¬ 
ule  almost  all  task  sets  (success  ratio  «  1.0).  However,  when 
the  laxities  are  high  (as  compared  to  computation  times)  and 
the  variation  in  laxities  is  also  high4,  then  the  heuristics  do 

3 Even  one  relation  reduces  the  search  space  by  half. 

4  Note  that  the  task  set  utilization  is  100% 


Figure  7:  Success  Ratio  vs  Cut-off-Time,  yc  —  5.0//c, 
COVc  =  1.0 

not  perform  as  well  and  the  use  of  temporal  analysis  results 
in  10  —  20%  improvement  in  success  ratio. 

As  an  illustration,  we  show  a  few  plots  which  plot  the 
success  percentage  (SP)  of  scheduling  with  temporal  analy¬ 
sis  (TAS)  contrasted  with  success  percentage  of  scheduling 
without  temporal  analysis,  i.e.  the  baseline  scheduling  model 
(BM).  For  scheduling  with  temporal  analysis,  we  consider  two 
cases,  one  in  which  overhead  of  temporal  analysis  is  added  to 
scheduling  time  (TAS+)  and  the  other  in  which  it  is  not  (TAS- 
).  The  parameters  varied  are  the  mean  laxity  yc  in  terms  of 
mean  computation  times  yc,  and  the  coefficient  of  variation 
for  laxity  COVc  ■  Figures  7  and  8  show  the  plots  for  low 
laxity  mean  with  low  and  high  variation.  For  this  case,  there 
is  no  significant  performance  improvement  due  to  temporal 
analysis  and  both  schemes  achieve  almost  100%  success  per¬ 
centage.  On  the  other  hand  when  the  average  laxity  is  high 
(Figures  9  and  10),  coupled  with  high  variation,  we  see  that 
temporal  analysis  results  in  significant  improvement  in  per¬ 
formance.  The  plots  also  show  that  the  curves  for  (TAS+) 
and  (TAS-)  are  almost  identical  showing  that  the  overhead  of 
temporal  analysis  is  minimal  when  compared  to  the  schedulng 
costs. 

6  Concluding  Remarks 

In  this  paper  we  have  presented  temporal  analysis  as  a  tech¬ 
nique  for  analyzing  the  timing  relationships  among  a  set  of 
tasks  to  establish  constraints  on  scheduling  which  are  dis¬ 
cernible  from  a  pairwise  analysis.  The  implications  and  the 
benefits  of  the  approach  as  a  pre-processing  stage  for  schedul¬ 
ing  has  been  shown  through  examples  and  simulation. 

Time  Driven  Scheduling  theory  has  relied  heavily  on  search 
techniques  for  scheduling  and  little  work  has  been  done  in 
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Figure  8:  Success  Ratio  vs  Cut-off-Time,  pc  =  5.0 pc, 
COVc  =  2.0 

developing  analytic  techniques.  Temporal  analysis  is  a  step 
in  this  direction  and  provides  an  efficient  way  of  analyzing  a 
task  set  and  deducing  valuable  information  for  scheduling. 

The  existence  of  an  infeasible  relation  in  a  task  set  gives 
a  sufficient  condition  for  infeasibility.  This  provides  an  early 
test  for  infeasibility,  which  can  then  be  used  as  a  basis  for 
selective  preemption  to  enhance  feasibility.  Alternatively,  the 
detection  of  infeasibility  may  be  used  to  allocate  more  re¬ 
sources  or  change  resource  allocation. 

The  precedence  relations  generated  as  a  result  of  temporal 
analysis  impose  a  partial  order  on  the  task  set  and  may  be 
effectively  used  to  prune  the  search  space  for  scheduling.  Our 
simulations  confirm  that  temporal  analysis  helps  in  improving 
the  performance  of  a  scheduling  algorithm  without  incurring 
a  significant  overhead.  In  the  simplest  scheduling  case,  when 
heuristics  perform  very  well,  temporal  analysis  might  be  per¬ 
ceived  as  a  way  of  formalizing  the  heuristics.  For  static  time 
driven  scheduling  to  be  a  feasible  technique,  it  becomes  im¬ 
perative  that  the  scheduling  cost  be  controlled  as  the  size  of 
the  problem  increases.  Temporal  analysis  provides  a  step  in 
the  right  direction. 

In  this  paper  we  have  been  concerned  with  single  proces¬ 
sor  scheduling.  An  interesting  extension  of  temporal  analysis 
would  be  to  use  it  for  multi-processor  scheduling.  One  way 
to  extend  the  analysis  to  multi-processor  scheduling  is  to  per¬ 
form  it  in  two  phases.  In  the  first  phase  the  infeasible  and 
concurrent  relations  may  be  used  to  obtain  an  allocation  of 
tasks  to  processors.  Then  in  the  second  phase,  the  analy¬ 
sis  shown  in  this  paper  can  be  used  for  each  processor  for 
scheduling. 

Many  real-time  system  specifications  impose  relative  tim¬ 
ing  constraints  on  the  tasks[23,  24].  In  this  paper,  we  have 
restricted  ourselves  to  absolute  constraints  on  the  start  and 
finish  times  of  tasks.  When  more  complex  constraints  are 


Figure  9:  Success  Percentage  vs  Cut-off-Time,  pc  —  10.0/rc, 
COVc  =  1.0 

imposed  on  tasks,  the  role  of  temporal  analysis  in  reducing 
the  search  space  becomes  even  more  important  since  simple 
heuristics  are  unlikely  to  perform  well.  It  would  be  interest¬ 
ing  to  see  how  temporal  analysis  can  be  extended  to  use  such 
constraints  to  further  prune  the  search  space. 

We  are  currently  implementing  a  scheduling  tool  based  on 
the  results  shown  in  this  paper.  The  tool  is  being  developed 
for  the  MARUTI  project,  an  experimental  real-time  system 
prototype  being  developed  at  the  University  of  Maryland, 
based  on  the  concept  of  pre-scheduling[6]. 
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Abstract 

The  Maruti  Real-Time  Operating  System  was  developed  for  applications  that  must 
meet  hard  real-time  constraints.  In  order  to  schedule  real-time  applications,  the  timing 
and  resource  requirements  for  the  application  must  be  determined.  The  development 
environment  provided  for  Maruti  applications  consists  of  several  stages  that  use  various 
tools  to  assist  the  programmer  in  creating  an  application.  By  analyzing  the  source  code 
provided  by  the  programmer,  these  tools  can  extract  and  analyze  the  needed  timing  and 
resource  requirements.  The  initial  stage  it  development  is  the  compilation  of  the  source 
code  for  an  application  written  in  the  Maruti  Programming  Language  (MPL).  MPL  is 
based  on  the  C  programming  language.  The  MPL  Compiler  was  developed  to  provide 
support  for  requirement  specification.  This  report  introduces  MPL  and  describes  the 
implementation  of  the  MPL  Compiler. 
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1  Introduction 


A  real-time  system  requires  that  an  application  meet  the  timing  constraints  specified  for  it. 
For  hard  real-time ,  a  failure  to  meet  the  specified  timing  constraints  may  result  in  a  fatal 
error  [2].  Timing  constraints  are  not  as  critical  for  soft  real-time.  The  Maruti  Operating 
System  was  developed  to  meet  the  real-time  constraints  required  by  many  applications.  In 
order  to  schedule  and  run  an  application  under  Maruti,  the  timing  and  resource  requirements 
for  that  application  must  be  determined.  The  development  environment  for  Maruti  consists 
of  several  tools  that  can  be  used  to  extract  and  analyze  these  requirements  [2]. 

The  Maruti  Programming  Language  (MPL)  is  a  language  developed  to  assist  users  in 
creating  applications  that  can  be  run  under  Maruti.  MPL  is  based  on  the  C  programming 
language,  and  assumes  the  programmer  is  familiar  with  C.  MPL  provides  some  additional 
constructs  that  are  not  part  of  standard  C  to  allow  for  resource  and  timing  specification  [1]. 
In  addition,  when  an  MPL  file  is  compiled,  some  of  the  resource  requirements  can  be 
recognized  and  recorded  to  an  output  file.  This  output  file  is  used  as  input  to  the  integration 
stage,  which  is  the  next  stage  in  the  development  cycle.  During  integration,  additional 
timing  requirements  may  be  specified. 

Previously,  an  MPL  file  w’as  compiled  by  first  running  the  source  code  through  the 
Maruti  pre-compiler,  which  created  a  C  file  that  was  then  compiled  using  a  C  compiler  [l). 
The  pre-compiler  extracted  the  necessary  information,  and  converted  the  MPL  constructs 
that  were  not  valid  C  statements  into  C  code.  This  required  the  additional  pass  of  the 
pre-compiler  over  the  source  code.  We  have  created  a  compiler  for  MPL  that  integrates 
both  the  actions  of  the  pre-compiler  and  the  compiler  into  one  stage.  In  this  report,  we 
present  MPL,  and  a  description  of  the  compiler  we  implemented.  Section  2  defines  the 
abstractions  used  in  Maruti.  In  Section  3,  the  syntax  of  the  constructs  unique  to  MPL  is 
defined.  The  details  of  the  implementation  of  the  compiler  are  given  in  Section  4.  Section  5 
describes  the  resource  information  that  is  recorded  during  compilation.  Conclusions  appear 
in  Section  6,  followed  by  an  Appendix  containing  a  sample  MPL  file,  and  the  resource 
information  recorded  for  that  file. 

2  Maruti  Abstractions 

An  MPL  application  is  broken  up  into  units  of  computation  called  elemental  units  (EUs). 
Execution  within  an  EU  is  sequential,  and  resource  and  timing  requirements  are  specified 
for  each  EU.  A  thread  is  a  sequential  unit  of  execution  that  may  consist  of  multiple  EUs. 
MPL  allows  threads  of  execution  to  be  specified  by  the  programmer  through  several  of  the 
constructs  provided.  A  task  consists  of  a  single  address  space,  and  threads  that  execute  in 
that  address  space. .  Modules  contain  the  source  code  of  the  application  as  defined  by  the 
programmer.  An  application  may  consist  of  several  modules.  During  execution,  modules 
are  mapped  to  one  or  more  tasks. 

3  MPL  Constructs 

There  are  several  constructs  defined  in  MPL  that  are  not  a  part  of  standard  C.  These 
constructs  have  been  implemented  in  the  MPL  compiler. 
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3.1  Module  Name  Specification 

A  module  may  consist  of  one  or  more  source  files  written  in  MPL.  At  the  start  of  each  MPL 
file,  the  name  of  the  module  that  the  source  file  corresponds  to  must  be  indicated.  This  is 
given  by  the  following  syntax:  . 

module-naae-spec  ::=  'nodule'  <aodule-aaae> 

The  nodule-aaae  may  be  any  valid  identifier  that  is  accepted  by  standard  C.  The  module 
name  specification  must  appear  at  the  beginning  of  the  source  file,  before  an}’  other  MPL 
code.  The  specification  is  not  compiled  into  any  executable  code.  It  is  simply  used  to 
indicate  the  module  that  the  functions  within  the  file  belong  to. 

3.2  Shared  Buffers 

A  shared  buffer  can  be  used  to  declare  memory  that  may  be  shared  by  several  tasks,  to 
permit  communication  between  the  tasks.  A  declaration  of  a  shared  buffer  requires  the  type 
be  defined  as  with  a  variable  declaration.  The  syntax  of  a  shared  declaration  is: 

shared-bull er-decl  'shared'  <type-specilier>  <shared-buller-name> . 

The  shared-bull er-nane  can  be  any  valid  identifier,  and  the  type-specifier  can  be  any 
valid  type  for  a  variable.  A  shared  declaration  is  compiled  as  a  pointer  to  the  type  given  in 
the  declaration  of  the  shared  buffer,  rather  than  the  type  given. 

3.3  Region  Constructs 

The  are  two  constructs  used  to  allow  for  mutual  exclusion  within  an  application. 

3.3.1  Region  Statement 

The  region  statement  is  used  to  enforce  mutual  exclusion  globally  throughout  an  entire 
application,  and  is  given  by  the  syntax: 

regior-suatemexrt  : :  =  'region'  <regioc-aaiiie> 

{  npl-stateaents  }. 

The  npi-stateaents  may  be  any  number  of  valid  MPL  statements.  These  statements 
make  up  a  critical  section. 

3.3.2  Local  Region  Statement 

The  locaLregion  statement  is  used  to  enforce  mutual  exclusion  within  a  task,  and  follows 
the  same  syntax  of  the  region  statement: 

local-regioa-suaueaent  ::=  'local.regiou ’  <local-regiou-naae> 

{  npl-staxeaents  }. 
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3.4  Channel  Declarations 


Channels  are  used  to  allow  for  message  passing  within  a  Maruti  application.  Each  channel 
declared  has  a  type  associated  with  it  given  by  a  valid  C  type- specifier.  This  type  indicates 
the  type  of  data  that  the  channel  will  carry. 

Channels  may  be  declared  in  both  entry  and  service  functions,  which  will  be  defined 
below.  The  syntax  for  channel  declarations  is: 

channel-declaration-list-opt  : :=  {  channel-declaration-list 

channel-declaration-list  ::=  channel-declaration  {  channel-declaration  >. 

channel-declaration  : : =  channel-type  channels  ’;*. 

channel-type  'out*  I  ’in’  I  ’in-first'  i  *  in-last’. 

channels  ::=  channel  {  ’,’  channel  }. 

channel  : :=  <channel-naae>  type-speciiier . 

3.5  Entry  Functions 

An  entry  function  is  a  special  type  of  function  that  may  be  defined  in  an  MPL  source  file. 
Each  entry  function  corresponds  to  a  thread  within  the  application.  The  syntax  for  an  entry 
function  definition  is: 

entry-function  : :  =  ’entry’  <entry-name>  *(*  ’)’  entry-function-body . 
entry-fnnction-body  ::=  channel-declaration-list-opt  npl -function-body . 

3.6  Service  Functions 

Service  functions  are  another  type  of  special  function  supported  by  MPL.  A  service  function 
is  invoked  when  a  message  is  received  from  a  client.  Each  service  function  definition  requires 
an  in  channel  and  message  buffer  be  included  in  the  definition.  The  service  function  will 
be  executed  when  there  is  a  message  on  the  channel  given  in  the  definition.  The  definition 
of  a  service  function  is  similar  to  that  of  an  entry'  function: 

service-function  ’service’  <service-name> 

’(’  <in-channel-na»e>  ’:’  type_specif ier  ’,’  <msg-ptr-nane>  ’)’ 
service-function-body . 

service-function-body  ::=  channel-declaration-list-opt  jnpl-function-body . 


3.7  Communication  Function  Calls 

There  are  several  library'  functions  used  to  allow  for  message  passing  within  a  Maruti  ap¬ 
plication. 

3.7.1  Send  Calls 

Each  call  to  the  send  function  must  specify  an  outgoing  channel  for  the  message: 

void  send  (  channel  channel_nane ,  void  *message_ptr  ); 


3.7.2  Receive  and  Optreceive  Calls 

Both  receive  calls,  and  optreceive  calls  must  be  associated  with  an  incoming  channel  (in, 
in.first,  or  in.lasi ): 

void  receive  (  channel  chanael.name,  void  »message_ptr  ); 
int  optreceive  (  channel  channel_naae ,  void  *message_ptr  ); 

A  call  to  receive  requires  that  there  be  a  message  on  the  incoming  channel.  Optreceive 
should  be  used  when  a  message  may  or  may  not  be  on  the  channel.  Optreceive  checks  for 
the  message,  and  returns  a  value  indicating  if  a  message  was  found. 

3.8  Initialization  Function 

Each  task  has  an  initialization  routine  that  is  executed  when  the  application  is  loaded.  This 
function  is  specified  by  the  user  with  the  following  name  and  arguments: 

int  maniti_main  (int  argc,  char  **argv) 


4  Implementation 

We  started  with  version  2.5.8  of  the  Gnu  C  compiler.  By  modifying  the  source  code  for 
the  C  compiler,  we  have  created  a  compiler  for  applications  written  in  MPL.  In  addition 
to  what  the  standard  Gnu  C  compiler  does,  this  modified  compiler  handles  the  additional 
constructs  defined  in  MPL,  and  records  information  about  the  source  code  that  is  needed 
by  Maruti.  A  source  code  file  written  in  MPL  is  specified  with  an  mpl  extension. 

4.1  Modifications  to  GCC  File  Structure 

In  the  process  of  modifying  the  compiler,  some  existing  files  were  modified.  In  addition, 
some  new-  files  were  also  created.  The  source  code  for  version  2.5.8  of  GCC  allows  compilers 
to  be  created  for  several  different  languages:  C,  C-f-f ,  and  Objective  C.  The  GCC  compile- 
uses  different  executable  files  for  the  different  languages  that  it  compiles.  There  are  separate 
files  for  C,  C-r-f,  and  Objective  C  (ccl,  cclplus,  cclobj).  The  GCC  driver,  gcc.c,  uses  the 
extension  of  the  source  file  specified  to  determine  the  appropriate  executable  (and  therefore 
language)  to  compile  the  source  file.  The  driver  then  executes  the  compiler,  passing  on  the 
appropriate  switches.  The  driver  was  modified  to  accept  input  files  with  an  mpl  extension. 
Cclmpl  is  the  new  executable  that  was  created  to  compile  MPL  source  files.  W^hen  a  file 
with  an  mpl  extension  is  specified  as  a  source  file  to  be  compiled,  this  new  executable  file  is 
used.  Wrhen  an  MPL  file  is  compiled,  it  automatically  passes  on  the  switch  -MarutLoutput, 
which  indicates  that  the  needed  output  should  be  recorded  io  A  ■file  with  an  ev  extension. 

The  executable  files  for  each  language  are  composed  of  many  object  files.  Some  of  these 
files  are  common  to  all  the  languages,  and  some  of  the  files  are  language-specific.  The 
language-specific  files  added  for  compiling  MPL  files  are  those  files  with  an  mpl-  prefix. 

Gperf  is  a  tool  used  to  generate  a  perfect  hash  function  for  a  set  of  words.  Gperf  is  used 
to  create  a  hash  function  for  the  reserved  words  for  each  language.  The  files  containing 
the  input  to  gperf  are  indicated  by  a  file  name  with  a  gperf  extension.  There  are  several 
different  * -gperf  files  containing  the  reserved  words  for  the  different  languages  recognized  by 
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the  compiler.  The  mpl-parse.gperf  file  contains  ail  the  reserved  words  for  C,  in  addition  to 
those  added  for  MPL.  For  each  language,  the  output  from  running  gperf  is  then  incorporated 
into  the  *-lez.c  file.  This  output  includes  a  function  isjreserved.word()  that  is  used  to  check 
if  a  token  is  a  reserved  word.  The  file  mpl-lez.c  is  basically  the  c-lez.c  filej  with  the  output 
of  running  gperf  on  mpl-parse.gperf  instead  of  c-parse. gperf. 

The  file  maruti.c  contains  the  routines  that  have  been  written  to  implement  MPL.  This 
file  is  linked  in  with  the  executable  for  all  of  the  languages,  to  prevent  undefined  symbol 
errors  from  occuring.  Calls  to  the  routines  contained  in  this  file  occur  in  both  the  language- 
specific,  and  the  common  files.  The  flag  marutLdump  is  set  in  mainQ  to  indicate  whether 
information  about  the  source  code  should  be  recorded  to  the  appropriate  output  file.  This 
flag  prevents  raik  to  the  routines  in  maruti.c  which  are  made  in  the  common  files  from 
occuring  for  the  languages  other  than  MPL.  The  files  containing  these  calls  are: 

•  calls,  c 

•  explow. c 

•  ezpr.c 

•  funclion.c 

•  toplev.c 

There  are  several  reasons  why  the  new  language-specific  files  have  to  be  created  for 
MPL.  The  files  mpl-lez.h  and  mpl-lez.c  needed  to  be  created  for  MPL  because  MPL  contains 
several  additional  reserved  words  not  present  in  C,  as  mentioned  earlier.  The  file  c-common.c 
relies  on  information  in  the  header  file  c-lex.h.  Since  MPL  uses  mpl-lez.h,  mpl-common. c 
includes  mpl-lez.h,  instead  of  c-lez.h.  Bison  is  a  tool  that  allows  a  programmer  to  define 
a  grammar  through  rules,  and  converts  them  into  a  C  program  that  will  parse  an  input 
file.  The  *-parse.y  files  are  the  bison  files  used  to  create  the  grammar  to  parse  a  source 
file.  Since  the  grammar  for  MPL  needed  to  be  modified  to  accept  the  additional  constructs, 
the  mpl-parse.y  file  was  created.  There  is  one  function  used  in  compiling  MPL  source  files 
that  is  defined  in  mpl-parse.y.  instead  of  maruti.c.  This  function  needed  to  access  the  static 
variables  declared  in  mpl-parse.y,  and  in  order  to  do  so.  the  function  definition  was  placed 
in  that  file.  Finally,  the  file  mpl-decl.c  was  created,  because  of  its  dependence  on  mpl-lez.h, 
and  also  to  allow  for  an  additional  type  specification  used  in  MPL. 

4.2  Compiling  MPL  Constructs 

MPL  extends  the  C  language  to  allow  for  various  constructs.  In  order  to  implement  these 
extensions,  the  grammar  used  to  recognize  C  in  GCC  had  to  be  extended.  The  following 
are  recognized  as  reserved  words  for  MPL,  in  addition  to  the  standard  reserved  words  for  C: 
shared,  region,  local.region,  module,  in,  out,  in.first,  in.last,  entry,  service,  send,  receive, 
and  optreceive.  The  keywords  in  and  out  were  reserved  words  in  the  c-*  files,  because 
they  axe  used  by  Objective  C,  but  in  MPL  they  are  used  as  channel  types.  In  addition  to 
the  new’  reserved  words,  rules  were  added  and  modified  resulting  in  the  rules  in  mpl-parse.y. 

4.2.1  Module  Name  Specification 

A  rule  was  added  to  the  grammar  to  parse  the  module  name  specification  in  an  MPL  file. 
The  rule  for  a  whole  program  was  also  modified  to  include  this  module  statement.  This 
rule  expects  the  module  statement  to  appear  before  any  other  definitions.  Since  the  module 
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name  specification  does  not  result  in  any  executable  code,  the  only  action  taken  is  to  record 
the  module  name  given  by  the  programmer. 

4.2.2  Shared  Buffers 

There  are  no  rules  added  to  the  grammar  for  a  shared  buffer  declaration.  When  a  variable 
declaration  is  parsed,  a  tree  is  created  that  keeps  track  of  all  the  specification  information 
given  for  that  declaration.  For  example,  typedef  and  extern  are  two  of  the  possible  type 
specifications.  The  token  shared  is  recognized  as  a  type  specification,  just  as  typedef  and 
extern  are  recognized.  When  a  declaration  is  made,  these  specifications  are  processed  in 
the  function  grokdeclarator( )  in  mpl-decl.c.  When  a  shared  specification  is  encountered, 
the  declaration  is  converted  to  a  pointer  to  the  type  specified,  instead  of  just  the  type 
specified.  Other  than  this  conversion  to  a  pointer,  the  declaration  is  compiled  just  as  any 
other  declaration  would  be  compiled  in  C. 

4.2.3  Region  Constructs 

The  region  constructs  are  considered  statements  in  MPL.  Several  rules  were  added  to  parse 
these  constructs,  and  the  region  and  local-region  statements  were  added  as  options  for  a 
valid  statement  in  the  grammar  for  MPL. 

Both  region  and  local-region  statements  are  compiled  in  the  same  manner.  Each  region 
has  a  name,  and  a  body  which  is  the  code  within  the  critical  section.  In  order  to  protect 
these  critical  sections,  calls  are  made  to  the  Maruti  library  function  maruti.ev().  When  a 
region  is  parsed,  the  compiler  generates  two  calls  to  maruli.eu(),  in  addition  to  the  code 
in  the  body  of  the  region.  The  first  call  is  generated  just  before  the  body,  and  the  second 
call  just  after.  These  calls  axe  generated  through  functions  in  maruti.c.  The  functions  axe 
based  on  the  actions  that  would  have  been  taken,  had  the  parser  actually  parsec  the  .-alU 
to  marutLeufJ  in  the  source  file. 

4.2.4  Channel  Declarations 

The  rules  added  for  a  channel  declaration  allow  any  number  of  channels  to  be  declared  in 
either  an  entry  or  a  service  function.  Each  channel  declaration  requires  several  pieces  of 
information: 

•  Channel-type 

•  Channel-name 

•  Type  specifier  indicating  the  type  of  data  that  channel  carries 

A  linked  list  of  declared  channels  is  maintained.  For  each  declared  channel  the  following 
information  is  saved: 

•  Channel-name 

•  Type  information 

1.  Size  in  bytes 

2.  Siring  encoding  the  type  of  the  data 

•  Channel-id 
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The  channel-id  is  a  unique  identification  number  assigned  to  each  declared  channel. 
Channel  declarations  do  not  add  to  the  compiled  code.  The  channels  are  not  allocated 
memory.  The  information  describing  each  channel  is  simple  stored  in  the  linked  list.  During 
compilation,  whenever  a  channel  is  referenced,  the  appropriate  information  is  obtained  from 
this  list. 


4.2.5  Entry  Functions 

Entry  function  definitions  are  compiled  differently  than  other  function  definitions.  An  entry 
function  would  appear  in  an  MPL  file  in  the  following  form: 

entry  <entry_nane>  () 

<cliaimel_declaxation_list_opt> 

i 

<apl_t nnct ion_body> 

} 


Where  entry  .name  is  an  identifier  that  is  the  name  of  the  entry  function,  the 
channel _declaration_listjopt  contains  any  channels  the  user  wants  to  define  for  that  func¬ 
tion,  and  mpl^unction_body  is  any  function  body  that  would  be  accepted  as  a  definition  in 
a  standard  MPL  function.  Semantically  the  entry  function  is  equivalent  to  the  following 
MPL  code: 

_mamti_entry_najne  () 

while(l) 

xcamci_eu() ; 
encry_naae  ( )  ; 

> 

> 

entry .name  () 
mpl_l  iinct  ion_body 

> 


An  entry  function  is  compiled  into  two  functions,  as  if  the  two  functions  given  above  had 
been  part  of  the  source  file.  Essentially,  the  first  function  is  just  a  stub  function  that  calls 
marutLexi(),  then  calls  the  second  function  compiled.  As  with  generating  function  calls, 
the  routines  to  generate  the  code  for  entry  function  definitions  are  based  on  the  actions 
that  would  have  been  taken  had  the  parser  actually  parsed  the  code  for  the  two  separate 
functions. 

4.2.6  Service  Functions 

Service  functions  definitions  are  handled  very  much  like  entry  function  definitions.  The 
syntax  of  a  service  function  differs  slightly  from  that  of  an  entry  function,  since  it  requires 
that  an  incoming  channel  and  a  message  buffer  be  defined: 
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service  <service_aaae>  (<in_channel_nane>  :  <type_specilier>,  <asg_T5tr_naae> ) 

<chaaael_declaxation_list_opt> 

i 

<Bpl_lunction_body> 

> 


Like  the  entry  functions,  service  functions  are  semantically  equivalent  to  two  functions, 
where  one  is  simply  a  stub  function  calling  the  second  function  that  is  generated: 

_BaxTiti_service_naae  () 

i 

type.specifier  _maruti_msg_ptr_najDe  ; 
shile(l ) 

il  (  optreeeive  (  _maruti_in  ,  id  ,  t  _jnaruti_BSg_ptr_naiae,  size  )  ) 
service_naBe  (t  _Baxuti_msg_ptr_name  ) ; 

> 

> 

> 

service_naae  (jnsg_ptr_name ) 
type_speci2ier  *Bsg_ptr_naBe ; 

< 

ttpl_2unctioz_body 

} 


The  servicejaaae ,  channeUeclarationJ.isc ,  and  Bpl_ltmctios._body  are  all  the  same 
as  described  previously  for  entry  functions.  In  addition,  service  functions  have  two  other 
items  specified  in  their  definitions.  The  first  is  a  channel.  Every  service  function  requires 
a  channel  be  specified.  This  channel  is  always  declared  as  an  in  channel  with  the  name 
ir-channel-naae .  The  type  is  given  by  typejspeciiier  as  if  it  had  been  declared  in  the 
channeUeclarataonJList.  The  channel  is  used  to  invoke  the  service  function.  This  in 
channel  is  used  by  the  optreeeive  in  the  stub  function  that  calls  the  function  containing  the 
service  function  body.  When  a  message  is  received  on  this  channel,  the  service  function  is 
executed.  The  second  additional  item  is  a  message  buffer  used  by  the  service  function.  The 
name  of  this  message  buffer  is  given  by  asg  jtr  jiaae ,  the  type  is  given  by  type_speci2ier. 
This  buffer  is  used  to  hold  the  message  received  from  the  client  that  invoked  the  service 
function,  and  is  passed  to  the  second  function  containing  the  body  of  the  service  function. 

4.2.7  Communication  Function  Calls 

There  were  three  library  functions  provided  for  message  passing  mentioned  previously:  send, 
receive,  and  optreeeive.  Function  calls  to  any  of  these  three  library  functions  are  handled 
differently  than  other  function  calls.  In  the  MPL  grammar,  send,  receive,  and  optreeeive 
are  all  reserved  words.  The  MPL  syntax  for  all  of  these  calls  is  the  following: 

<funcrion-naBe>  (<chcmne2-naBe> ,  <paraE>ete:—  2>) ; 
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Channel-name  should  be  a  previously  declared  channel,  and  parameter-2  should  be  a 
pointer.  These  function  calls  must  be  compiled  differently,  since  these  are  not  the  actual 
parameters  used  when  the  call  is  generated.  In  the  case  of  a  call  to  send,  the  actual 
parameters  must  be  as  follows: 

send  (<chamnel-id>,  <parameter-2> ,  <channel-size>) ; 

In  the  case  of  a  call  to  either  receive,  or  optreceive,  the  parameters  required  are: 

receive  I  optreceive  (<channel-type> ,  <channel-id> ,  <parameter-2> ,  <channel-size>) ; 

The  channel-type  for  a  receive  or  optreceive  call  is  an  integer  generated  by  the  compiler 
that  will  indicate  an  in,  in.first,  or  inJast  channel. 

When  one  of  these  three  function  calls  are  encountered,  there  are  special  rules  in  the 
grammar  to  handle  it.  A  function  in  maruti.c  is  called  which  generates  the  appropriate 
parameters,  and  then  the  function  call  itself.  These  function  calls  are  generated  as  men-  \ 
tioned  above  for  the  calls  to  marulLeuQ.  The  channel-name  specified  by  the  user  is  used 
to  obtain  the  necessary  parameters.  Given  the  channel  name,  the  linked  list  of  channels  is 
searched  to  find  the  corresponding  channel,  then  the  channel-id  and  the  channel-size  are 
obtained  from  that  node  in  the  linked  list.  There  is  also  some  type  checking  done  at  this 
stage.  The  compiler  verifies  that  only  an  outgoing  channel  is  specified  for  a  send  call,  or  an 
incoming  channel  for  the  receive  and  optreceive  calls.  The  compiler  also  checks  that  any 
channel  referenced  has  been  previously  defined. 

The  grammar  for  MPL  was  modified  so  that  a  call  to  any  of  the  communication  functions 
may  occur  anywhere  that  a  primary  expression  occurs,  since  that  is  where  other  function 
calls  are  permitted  to  occur. 

4.2.8  Initialization  Function 

The  user- defined  function  marutLmain()  is  compiled  as  an  ordinary  C  function. 

5  PEUG  File 

The  source  code  of  an  MPL  file  is  broken  up  into  elemental  units.  Each  elemental  unit 
identifies  the  resources  that  it  requires.  These  elemental  units  are  used  later  in  the  develop¬ 
ment  process  for  scheduling  the  application.  The  output  file  created  by  the  MPL  compiler 
creates  a  Partial  Elemental  Unit  Graph  (PEUG)  for  the  given  source  file.  The  name  of  this 
file  is  the  name  of  the  source  hie,  with  the  mpl  extension  replaced  by  an  eu  extension. 

There  are  several  different  types  of  information  recorded  in  this  PEUG  file. 

5.1  Module  Name 

The  first  line  in  the  output  file  indicates  the  name  of  the  module,  and  will  appear  as: 
peug  <aodule-aa»e> 

The  oodule-naae  is  taken  directly  from  the  module  name  specification  given  in  the  MPL 
source  file. 


5.2  File  Name 


The  second  line  in  the  source  file  indicates  the  name  of  the  target  file  that  is  created  by  the 
compiler,  where  file-name  is  the  target: 

file  <file-name> 

5.3  Shared  Buffers 

Each  time  a  shared  buffer  is  declared  its  name  and  type  information  is  recorded  to  the 
output  file: 

shared  <shared-buflex— name>  :  (type-description-string> ,  <type-size>) 

The  type-descript  ion-string  and  type-size  of  a  shared  buffer  is  obtained  from  the 
type  specification,  and  is  represented  in  the  same  manner  as  the  type  and  size  for  a  chan¬ 
nel.  Although  the  shared  buffer  is  actually  a  pointer  to  the  type  it  is  declared  as,  the 
type-description-string  represents  the  object  being  pointed  to,  and  not  the  pointer  itself. 

5.4  Entry,  Service,  and  User  Function  Definitions 

In  MPL,  a  user  may  define  ordinary  functions  in  addition  to  the  entry  and  service  functions 
that  are  permitted  in  MPL.  For  each  entry,  service  or  ordinary  user-defined  function,  there 
is  an  entry  in  the  output  file.  This  entry  has  the  following  format: 

<function-type>  <function-name> 


size  <stack-size> 

Function-type  can  be  either  function,  entry,  or  service,  indicating  which  type  of  function 
is  being  defined.  Function-name  is  the  declared  name  of  the  function  in  the  source  file. 
Stack-size  is  the  maximum  stack  size  needed  by  this  function.  This  stack-size  includes  the 
arguments  pushed  onto  the  stack  preceding  any  function  rails  occuring  within  the  function 
body.  There  will  also  be  other  information  concerning  the  body  of  the  function  that  will 
appear  between  the  function-name ,  and  the  stack-size.  The  entry  for  the  maruti-mainQ 
function  will  be  the  same  a£  those  for  other  user  defined  functions.  Entry  and  service 
functions  will  contain  some  additional  information  not  applicable  to  ordinary  functions 
that  will  be  described  below. 

5.4.1  Channels 

For  each  channel  that  is  declared,  a  description  of  the  channel  is  written  to  the  output  file. 
These  descriptions  will  occur  right  after  the  statement  indicating  the  name  of  the  current 
function: 


<channel-type>  <name>  :  (<description-string> ,  <size>) 


The  channel-type  and  channel-name  will  be  the  type  and  name  specified  in  the  source 
file.  The  description-string  and  size  are  based  on  the  type  specification  in  the  channel 
declaration.  Channel  descriptions  will  occur  only  in  entry  and  service  functions.  A  service 
function  will  always  contain  at  least  one  channel  description,  since  the  syntax  of  a  service 
function  requires  a  channel  be  named  in  the  definition.  A  channel  description  will  also  be 
output  for  ever}'  send,  receive,  and  optreceive  call,  since  these  calls  require  a  channel  as  one 
of  their  parameters. 

5.4.2  Function  Calls 

Each  time  a  function  call  is  parsed,  there  will  be  a  line  in  the  output  file: 

calls  Cfunction-name>  {in_cond>  {in.loop} 

This  line  indicates  where  a  function  call  occurs,  and  which  function  is  being  called.  The 
in_cond  and  inJloop  indicate  if  this  function  call  appears  within  a  conditional  statement  or 
within  a  loop.  These  labels  will  be  seen  only  if  their  respective  conditions  are  true. 

5.4.3  Communication  Function  Calls 

Any  call  to  a  communication  function  is  recorded  similarly  to  other  function  calls.  There  is 
a  line  indicating  the  name  of  the  function,  as  shown  above  for  a  function  call.  In  addition, 
there  will  be  aline  describing  the  channel  associated  with  that  communication  function  call. 
This  line  will  appear  just  as  the  line  for  the  channel  definition  described  above  appears. 

5.4.4  EU  Boundaries 

The  output  file  for  an  MPL  source  file  indicates  where  each  elemental  unit  (EU)  begins  by 
the  following: 

eu  <E>  {region.list} 

The  N  indicates  an  EU  number.  Each  EU  within  a  source  file  has  a  unique  number. 
There  are  several  places  where  EU  boundaries  are  created: 

•  Start  of  a  function 

•  Start  of  a  region 

•  End  of  a  region 

•  Explicit  calls  to  maruti.eu() 

The  initial  EU  occuring  at  the  beginning  of  a  function  that  is  not  a  service  or  entry  function 
is  a  special  case.  This  is  always  labeled  as  “eu  0”  in  the  output  file,  and  does  not  represent 
an  actual  EU. 

Each  EU  may  also  be  followed  by  a  list  describing  one  or  more  regions.  This  list 
represents  the  regions  that  this  EU  occurs  within.  The  description  of  a  region  appears  as: 

(region-name  instance  access  type) 

The  region-name  is  just  that  given  by  the  user,  and  the  type  indicates  if  a  region  is  local 
(local-region  construct)  or  global  (region  construct).  The  access  indicates  if  the  access  is 
read  or  write.  The  instance  indicates  the  instance  of  this  region  within  the  source  file. 
Each  instance  for  a  region  within  a  source  file  is  unique. 
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6  Conclusions 


Basing  MPL  on  C  has  simplified  the  development  of  both  the  language  and  its  compiler. 
The  language  is  easy  to  learn  for  any  programmer  that  has  used  C  before,  since  there 
are  a  limited  number  of  additional  constructs  unique  to  MPL.  Using  the  GCC  C  source 
code  provided  an  existing  compiler,  rather  than  implementing  a  new  one.  The  source  code 
for  GCC  only  needed  to  be  modified  to  handle  some  additional  constructs,  and  produce 
some  additional  output.  This  made  the  implementation  fairly  simple.  However,  the  GCC 
C  compiler  also  provides  some  functionality  that  is  not  needed  by  MPL.  Much  of  this 
functionality  provided  is  not  even  permitted.  These  restrictions  are  not  enforced  by  the 
compiler,  but  should  be  detected  within  the  development  cycle. 

Prior  to  the  development  of  the  MPL  compiler  using  GCC,  compiling  an  MPL  source 
file  required  two  steps.  The  source  files  were  initially  passed  through  a  pre-compiler  to 
extract  the  available  resource  information  and  parse  the  MPL  constructs.  The  pre-compiler 
was  responsible  for  converting  the  MPL  code  into  valid  C  code,  which  was  then  compiled 
using  a  standard  C  compiler.  The  new  implementation  of  the  compiler  eliminates  some 
of  the  redundant  processing  that  is  done  when  the  pre-compiler  is  used.  The  information 
obtained  through  the  pre-compiler  already  existed  in  the  internal  structure  used  by  the  GCC 
compiler.  This  information  just  needed  to  be  recorded.  Instead  of  per^rng  source  code  files 
in  the  two  steps  independently,  the  functionality  of  the  pre-compiler  has  been  incorporated 
into  the  compiler  itself.  The  MPL  compiler  provides  a  single  tool  that  extracts  all  the 
available  information  at  the  initial  stage  of  develpment. 

In  the  future,  a  version  of  MPL  may  be  implemented  that  is  based  on  the  Ada  pro¬ 
gramming  language.  GNAT  is  a  compiler  for  Ada  9X  that  is  being  developed  at  NYU. 
GNAT  depends  on  the  backend  of  the  GCC  compiler.  Using  the  source  code  for  GNAT, 
an  implementation  of  MPL  based  on  Ada  would  be  similar  to  the  current  implementation 
based  on  C. 
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Appendix 


A  MPL  File 

The  following  is  a  sample  of  MPL  source  code: 

module  timer; 

typedef  struct  { 
ist  seconds; 
int  minutes; 
int  hours; 

>  time_type ; 

shared  time_type  global_time; 

maruti_maj.n(argc ,  argv) 
int  argc; 
char  **argv; 

{ 

global_time->seconds  =  0; 
global_time->minutes  *  0; 
global_tiae->hours  =  0; 

return  0; 

> 

entry  update_second() 
out  cist  :  time_type; 

< 

time_type  msg; 

region  time.region  { 

global_time->seconds^-+ ; 
il  (global_time->seconds  ==  60) 
global_time->seconds  =  0; 
msg  =  *global_time; 

> 

send  (disp,  tasg); 

> 

entry  update_ainnte() 
out  display  :  time_type; 

time.type  msg; 

region  time_region  { 

global_time->minutes++ ; 
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il  (global_tiBe->minutes  ==  60) 
global _time->minutes  =  0; 
msg  =  »global_time; 

> 

send  (display,  Jonsg) ; 

> 

entry  npdate_honr() 

ont  display  :  time.type; 

< 

time.type  msg; 

region  time_region  { 

global_time->bonrs++ ; 
il  (global_time->honrs  ==  24) 
global_time->haurs  =  0; 
msg  =  *global_time; 

> 

send  (display,  tansg); 

> 

service  display_time(incban  :  time_type ,  time) 

i 

printl (" Current  Time:  7.d  :  7.d  :  y.d",  time->honrs,  time->minntes,  time->seconds) ; 

> 
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B  PEUG  File 

The  corresponding  PEUG  file  for  the  source  code  above  is: 

peug  titter 
lile  titter. o 

shared  globaI_tine  :  ($(iii),  12) 

Inaction  naruti_nain 
eu  0 
size  4 

entry  update_ second 

out  disp  :  ($(iii),  12) 
eu  2 

eu  3  (tine_region  1  V  global) 
calls  maruti_eu 
eu  4 

calls  naruti_eu 
calls  send 

out  disp  :  ($(iii),  12) 

size  32 

entry  update_minute 

out  display  :  ($(iii),  12) 
eu  5 

eu  6  (tiae_region  2  V  global) 
calls  naruti_en 

eu  7 

calls  aaxuti_eu 
calls  send 

out  display  :  ($(iii),  12) 

size  32 

entry  update_hour 

out  display  :  (S(iii),  12) 
eu  8 

eu  9  (tine_region  3  V  global) 
calls  maruti.eu 

eu  10 

calls  ttaruti_eu 
calls  send 

out  display  :  ($(iii),  12) 

size  32 

service  display_tine 

in  inchan  :  (S(iii),  12) 
eu  11 

calls  optreceive 
in  inchan  :  (S(iii),  12) 
calls  print! 

size  32 
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Chapter  1 
Introduction 


The  Maruti  Programming  Language  (MPL)  is  used  to  write  Maruti  application  code. 
Currently,  MPL  is  based  on  the  ANSI  C  programming  language,  with  extensions  to 
support  modules,  real-time  constructs,  communications  primitives,  and  shared  memory. 

The  Maruti  Configuration  Language  (MCL)  is  used  to  specify  how  individual 
program  modules  are  to  be  connected  together  to  form  an  application  and  the  details  of  the 
hardware  platform  on  which  the  application  is  to  be  executed. 

1.1.  General  Program  Organization 


A  complete  Maruti  system  is  called  an  application.  Applications  can  be  large,  distributed 
systems  made  up  of  many  subsystems.  Each  application  is  defined  by  a  configuration  file , 
which  defines  all  the  subsystems  and  their  interactions.  The  following  entities  make  up  an 
application: 


Jobs  Jobs  are  the  active  entities  in  a  Maruti  application.  Jobs  are  specified  in  the 

configuration  file  with  timing  constraints,  including  the  job  period.  A  job  is  made 
up  of  multiple  entry  points,  which  are  the  threads  of  execution  that  will  be  run  for 
the  job. 

Modules  The  code  of  an  application  is  divided  into  modules.  Each  module  consists  of 
entry  points,  which  define  the  code  which  will  be  executed  as  part  of  a  job, 
services,  which  define  code  to  be  invoked  on  behalf  of  a  client  module,  and 
functions,  which  are  called  from  entries  and  services. 

Tasks  At  run-time,  modules  map  to  tasks  (a  module  may  be  mapped  to  more  than  one 

task).  Each  task  consists  of  an  address  spac  and  threads  of  execution  for  the  entry 
points  and  services  of  the  module. 

Channels  Channels  are  the  communication  paths  for  Maruti  applications.  Each  channel  is 
a  one-way  connection  through  which  typed  messages  are  passed.  The  end  points 
are  defined  by  out  and  in  channel  specifiers,  and  are  connected  as  specified  in  the 
application  configuration  file.  Each  end  point  is  associated  with  one  entry  or 
service,  and  its  message  type  and  channel  type  are  declared  within  the  entry  or 
service  header.  The  types  of  the  in  and  out  channel  specifiers  must  match. 

Regions  Regions  are  the  mechanism  for  mutual  exclusion  between  Maruti  threads:  only 
one  thread  can  enter  a  particular  region  at  a  time.  Two  types  of  regions  may  be 
specified:  global  regions  enforce  exclusion  for  the  entire  Maruti  application,  while 
local  regions  enforce  exclusion  only  within  a  single  task. 

Shared  buffers  Named  memory  buffers  can  be  shared  between  tasks.  The  buffer  is 
mapped  into  the  address  space  of  each  task  that  uses  that  buffer. 
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1.1.1.  Maruti  Programming  Language 

Rather  than  develop  completely  new  programming  languages,  we  have  taken  the  approach 
of  using  existing  languages  as  base  programming  languages  and  augmenting  them  with 
Maruti  primitives  needed  to  provide  real-time  support 

In  the  current  version,  the  base  programming  language  used  is  ANSI  C.  MPL  adds 
modules,  shared  memory  blocks,  critical  regions,  typed  message  passing,  periodic 
functions,  and  message-invoked  functions  to  the  C  language.  To  make  analyzing  the 
resource  usage  of  programs  feasible,  certain  C  idioms  are  not  allowed  in  MPL;  in 
particular,  recursive  function  calls  are  not  allowed  nor  are  unbounded  loops  containing 
externally  visible  events,  such  as  message  passing  and  critical  region  transitions. 

•  The  code  of  an  application  is  divided  into  modules.  A  module  is  a  collection  of 
procedures,  functions,  and  local  data  structures.  A  module  forms  an  independently 
compiled  unit  and  may  be  connected  with  other  modules  to  form  a  complete  application. 
Each  module  may  have  an  initialization  function  which  is  invoked  to  initialize  the 
module  when  it  is  loaded  into  memory.  The  initialization  function  may  be  called  with 
arguments. 

•  Communication  primitives  send  and  receive  messages  on  one-way,  typed  channels. 
There  are  several  options  for  defining  channel  endpoints  that  specify  what  to  do  on 
buffer  overflow  or  when  no  message  is  in  the  channel.  The  connection  of  two  end¬ 
points  is  done  in  the  MCL  specification  for  the  application  -  Maruti  insures  that  end¬ 
points  are  of  the  same  type  and  are  connected  properly  at  runtime. 

•  Periodic  functions  define  entry  points  for  execution  in  the  application.  The  MCL 
specification  for  the  application  will  determine  when  these  functions  execute. 

•  Message-invoked  functions,  called  services,  are  executed  whenever  messages  are 
received  on  a  channel. 

•  Shared  memory  blocks  can  be  declared  inside  modules  and  are  connected  together  as 
specified  in  the  MCL  specifications  for  the  application. 

•  Critical  Regions  are  used  to  safely  maintain  data  consistency  between  executing 
entities.  Maruti  ensures  that  no  two  entities  are  scheduled  to  execute  inside  their  critical 
regions  at  the  same  time. 

1.1.2.  Maruti  Configuration  Language 

MPL  Modules  are  brought  together  into  as  an  executable  application  by  a  specification  file 
written  in  the  Maruti  Configuration  Language  (MCL).  The  MCL  specification  determines 
the  application's  hard  real-time  constraints,  the  allocation  of  tasks,  threads,  and  shared 
memory  blocks,  and  all  message-passing  connections.  MCL  is  an  interpreted  C-like 
language  rather  than  a  declarative  language,  allowing  the  instantiation  of  complicated 
subsystems  using  loops  and  subroutines  in  the  specification.  The  key  features  of  MCL 
include: 
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•  Tasks,  Threads,  and  Channel  Binding.  Each  module  may  be  instantiated  any 
number  of  times  to  generate  tasks.  The  threads  of  a  task  are  created  by  instantiating  the 
entries  and  services  of  the  corresponding  module.  An  entry  instantiation  also  indicates 
the  job  to  which  the  entry  belongs.  A  service  instantiation  belongs  to  the  job  of  its 
client  The  instantiation  of  a  service  or  entry  requires  binding  the  input  and  output  ports 
to  a  channel.  A  channel  has  a  single  input  port  indicating  the  sender  and  one  or  more 
output  ports  indicating  the  receivers.  The  configuration  language  uses  channel  variables 
for  defining  the  channels.  The  definition  of  a  channel  also  includes  the  type  of 
communication  it  supports,  i.e.,  synchronous  or  asynchronous. 

•  Resources.  All  global  resources  (i.e.,  resources  which  are  visible  outside  a  module) 
are  specified  in  the  configuration  file,  along  with  the  access  restrictions  on  the  resource. 
The  configuration  language  allows  for  binding  of  resources  in  a  module  to  the  global 
resources.  Any  resources  used  by  a  module  which  are  not  mapped  to  a  global  resource 
are  considered  local  to  the  module. 

•  Timing  Requirements  and  Constraints.  These  are  used  to  specify  the  temporal 
requirements  and  constraints  of  the  program.  An  application  consists  of  a  set  of 
cooperating  jobs.  A  job  is  a  set  of  entries  (and  the  services  called  by  the  entries)  which 
closely  cooperate.  Associated  with  each  job  are  its  invocation  characteristics,  i.e., 
whether  it  is  periodic  or  aperiodic.  For  a  periodic  job,  its  period  and,  optionally,  the 
ready  time,  and  deadline  within  the  period  are  specified.  The  constraints  of  a  job  apply 
to  all  component  threads.  In  addition  to  constraints  on  jobs  and  threads,  finer  level 
timing  constraints  may  be  specified  on  the  observable  actions.  An  observable  action 
may  be  specified  in  the  code  of  the  program.  For  any  observable  action,  a  ready  time 
and  a  deadline  may  be  specified.  These  are  relative  to  the  job  arrival.  An  action  may  not 
start  executing  before  the  ready  time  and  must  finish  before  the  deadline.  Each  thread  is 
an  implicitly  observable  action,  and  hence  may  have  a  ready  time  and  a  deadline. 

Apart  from  the  ready  time  and  deadline  constraints,  programs  in  Maruti  can  also  specify 
relative  timing  constraints,  those  which  constrain  the  interval  between  two  events.  For  each 
action,  the  start  and  end  of  the  action  mark  the  observable  events.  A  relative  constraint  is 
used  to  constrain  the  temporal  separation  between  two  such  events.  It  may  be  a  relative 
deadline  constraint  which  specifies  the  upper  bound  on  time  between  two  events,  or  a  delay 
constraint  which  specifies  the  lower  bound  on  time  between  the  occurrence  of  the  two 
events.  The  interval  constraints  are  closer  to  the  event-based  real-time  specifications,  which 
constrain  the  minimum  and/or  maximum  distance  between  two  events  and  allow  for  a  rich 
expression  of  timing  constraints  for  real-time  programs. 

•  Replication  and  Fault  Tolerance.  At  the  application  level,  fault  tolerance  is  achieved 
by  creating  resilient  applications  by  replicating  part,  or  all,  of  the  application.  The 
configuration  language  eases  the  task  of  achieving  fault  tolerance,  by  allowing 
mechanisms  to  replicate  the  modules,  and  services,  thus  achieving  the  desired  amount 
of  resiliency.  By  specifying  allocation  constraints,  a  programmer  can  ensure  that  the 
replicated  modules  are  executed  on  different  partitions. 
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Chapter  2 


Tutorial 

2.1.  Basic  Maruti  Program  Structure 

Marati  applications  are  built  up  out  of  one  or  more  MPL  modules,  and  tied  together  with  a 
configuration  file  written  in  MCL.  Well  start  our  tutorial  with  an  explanation  of  a  very 
simple  application  consisting  of  one  module,  called  simple.mp.  Our  simple  application  will 
contain  a  producer  thread  that  sends  out  integer  data,  and  a  consumer  thread,  which 
receives  integer  values  and  prints  them  out 

The  Module 


module  simple; 


int  data; 


maruti_main(int  argc,  char  **argv) 

{ 

if(argc  <  1)  { 

printf("simple:  requires  an  integer  argument\n"); 
return  1; 

} 

data  =  atoi(argv[0]); 
return  0; 


This  first  part  of  the  module  will  be  similar  in  all  Maruti  modules.  The  module 
always  starts  with  the  module  name  declaration.  After  the  module  declaration,  the  MPL 
module  is  much  like  any  ANSI  C  program,  but  with  some  special  Maruti  definitions. 

Every  module  must  contain  a  function  named  maruti_main,  which  initializes  the 
module  at  load  time.  This  initialization  would  normally  include  things  like  device  probing 
or  painting  the  screen.  The  maruti_main  function,  exactly  like  the  main  function  of  a  C 
program,  takes  an  argument  count  and  list  as  its  parameters,  and  returns  an  error  code  to  its 
environment  In  Maruti,  the  environment  is  the  system  loader,  and  any  non- zero  return 
results  in  a  load  failure,  in  which  case  the  application  will  not  run.  In  our  example, 
maruti_main  is  responsible  for  setting  the  initial  value  of  our  datum  from  the  environment 
and  returning  a  failure  code  if  there  is  no  argument . 
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Periodic  Functions 


entry  producerO 
out  och:  int; 

{ 

data++;  /*  produce  data  */ 

send(och,  &data); 

} 


The  producer  is  a  periodic  function,  or  Maruti  entry  point.  It  serves  as  the  top-level 
function  for  a  Maruti  thread  that  will  be  invoked  repeatedly,  with  a  period  specified  in  the 
MCL  config  file  (which  we  will  see  below). 

The  producer  outputs  its  data  on  a  Maruti  channel,  using  the  built  in  MPL  send 
function.  The  channel  och  is  declared  as  part  of  the  function  header  of  producer.  Maruti 
channels  are  declared  to  have  a  type,  usually  a  structure  but  in  this  case  a  simple  integer. 
All  messages  sent  on  the  channel  will  be  of  the  same  type. 

Note  that  there  is  no  open,  bind,  or  connect  statement  needed  to  initiate 
communication  on  the  channel.  The  connection  of  the  channel  will  be  specified  in  die 
config  file,  and  initiated  automatically  by  the  runtime  system. 

Message-Invoked  Functions 


service  consumer(ich:  int,  msg) 

{ 

printf("consumer  got  %d\n",  *msg);  I*  consume  data  */ 

} 


The  consumer  is  a  message-invoked  function,  or  Maruti  service.  It  serves  as  the 
top-level  function  for  a  Maruti  thread  that  is  invoked  whenever  there  is  a  message  delivered 
on  the  channel  declared  in  the  function  header.  The  msg  parameter  is  the  name  of  the 
pointer  to  the  message  buffer  that  will  contain  the  delivered  message. 

Since  the  receipt  of  the  invoking  message  is  automatic  for  a  Maruti  service,  the  only 
thing  our  consumer  has  to  do  is  print  out  the  data  value  contained  in  the  message. 

This  completes  our  simple  module,  but  in  order  to  have  a  Maruti  application,  we 
must  have  a  config  file  that  tells  the  system  how  to  run  our  program. 

The  Config  File 

The  config  file  is  written  in  the  Maruti  Configuration  Language  (MCL),  an  interpreted  C- 
like  language  with  constructs  that  allow  an  application  to  be  built  up  from  pieces  and 
interconnected.  The  MCL  processor,  called  the  integrator,  builds  a  program  graph  from 
the  specifications,  analyses  it  for  type  correctness  and  completeness,  and  checks  for 
dependency  cycles.  Here  is  the  config  file,  simple. cfg,  that  goes  with  our  application  : 
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application  simple  { 


j  ob  j ;  /*  declare  variables  */ 

task  si; 
channel  c; 

init  j:  period  1  s;  /*  specify  job  parameters  *1 
start  si:  simple(27);  /*  specify  task  parameters  */ 


o  si.producer  <c>  in  j;  /*  producer  thread  */ 
<c>  si.consumer  <>;  /*  consumer  thread  */ 

} 


The  variables  in  MCL  correspond  to  the  objects  that  make  up  an  application,  such 
as  channels,  tasks,  and  jobs.  As  in  C,  these  variables  must  be  declared  before  they  are 
used. 

In  Maruti,  a  job  is  a  logical  collection  of  threads  that  run  with  the  same  period.  AD 
entry  functions  in  die  application  must  be  put  in  some  job.  The  init  statement  sets  the 
period  for  a  particular  job.  In  our  case,  the  job  j  will  run  once  every  second. 

A  task  is  the  runtime  instantiation  of  an  MPL  module,  just  as  in  Unix  a  process  is 
the  runtime  image  of  a  program.  Many  tasks  may  be  executed  from  the  same  module,  each 
will  run  independendy  in  die  Maruti  application.  The  MCL  start  command  instantiates  a 
task  from  a  module.  In  our  example,  we  instantiate  one  task  from  the  module  simple  and 
pass  it  the  initial  data  value  of  27. 

We  instantiate  the  threads  for  the  entry  and  service  functions  inside  a  particular  task, 
with  particular  input  and  output  channels.  In  our  example,  the  statement 


o  si. producer  <c>  in  j;  /*  producer  thread  */ 

instantiates  the  si.producer  thread  in  job  j  with  no  input  channels  and  one  output  channel, 
c.  Likewise,  the  statement 

<c>  sixonsumer  <>;  I*  consumer  thread  */ 

instantiates  the  si.consumer  thread  with  one  input  channel  c,  and  no  output  channels 
Service  functions  are  not  put  in  a  job,  but  rather  inherit  the  scheduling  characteristics  of  the 
thread  that  is  sending  to  their  invoking  channel. 

The  integrator  checks  to  insure  that  the  use  of  producer  and  consumer  in  the  config 
file  match  the  declarations  in  the  program  module. 

Building  and  Running  the  Application 

We  can  build  the  simple  application  by  putting  simple.mpl  and  simple.cfg  in  a  directory, 
and  running  the  mbuild  command  there: 
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%ls 

simple.cfg  simple.mpl 
%  mbuild 

mbuild:  extracting  module  info  from  MCL  file  ‘simple.cfg’ 
mbuild:  creating  obj  subdirectory  for  output  files, 
mbuild:  generating  obj/simple-build.mk 
mbuild:  running  make  -f  obj/simple-build.mk 


Mbuild  takes  care  of  running  the  MPL  compiler,  the  MCL  integrator,  as  well  as  the 
analysis  and  binding  programs  needed  to  build  the  runnable  Maruti  application.  By  default, 
mbuild  creates  both  a  stand-alone  binary  that  can  be  booted  on  the  bare  machine,  and  a 
Unix  binary  that  runs  in  virtual  real  time  from  within  the  Unix  development  environment 
These  different  versions  of  the  runtime  system  are  called  flavors. 

We  can  try  out  the  simple  application  by  running  the  ux+xll  flavor  from  the 
command  line: 


%  obj/simple.ux+xl  1 
<...  startup  messages  ...> 
consumer  got  28 
consumer  got  29 
consumer  got  30 
consumer  got  31 
consumer  got  32 
consumer  got  33 
consumer  got  34 
consumer  got  35 
consumer  got  36 
consumer  got  37 


application  quit 


The  application  boots  up  and  outputs  the  consumer  message  once  every  second. 
We  can  exit  the  application  by  typing  ‘q\ 

22.  Using  the  Graphics  Library 

Many  Maruti  programs  will  want  to  use  the  graphical  screen  as  a  monitor  for  an  embedded 
system,  producing  oscilloscope  or  bar-graph  style  displays,  or  for  animating  a  simulation 
or  demonstration.  Maruti  provides  a  console  graphics  library  as  an  integral  part  of  the 
system  to  make  the  development  of  visually  oriented  applications  simpler.  Our  next 
example  application,  clock,  demonstrates  the  use  of  the  graphics  library  as  well  as  the  use 
of  multiple  jobs  to  take  advantage  of  Maruti's  scheduling  abilities. 
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Figure  2.1:  Dsiplay  of  clock  example  application 

The  clock  application  will  display  a  circular  clock  face  on  the  screen,  with  the  hour, 
minute,  and  second  hands  moving  as  independent  Maruti  threads  in  different  jobs.  The 
clock  screen  is  shown  in  Figure  2.1. 

We  will  now  go  through  the  clock.mpl  module  and  see  how  it  works. 


module  clock; 

#include  <maruti/mtime.h> 

#include  <maruti/console.h> 

#include  <math.h> 

#include  "clock.h" 

#defme  CENTER_X  (CONSOLE_WIDTH/2)  /*  useful  constants  */ 

#defme  CENTER_Y  (CONSOLE_HEIGHT/2) 

void  check_for_quitkey(void);  /*  subroutines  */ 

void  polar_point(int  pos,  int  radius,  int  *x,  int  *y); 
void  xor_triangle(int  pos,  int  apex_radius4nt  color); 
void  xor_ray(int  pos,  int  color); 

int  sec_pos,  min_pos,  hour_pos;  /*  system  state  */ 
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The  first  part  of  the  module  is  much  like  any  other  ANSI-C  program,  with  # 
includes,  #  defines,  and  function  prototype  declarations  for  subroutines  to  be  used  later  in 
the  program.  Notice  the  two  Maruti  header  files  included:  <maruti/mtime.h>  contains 
declarations  related  to  Maruti  time  management,  and  <maruti/console.h>  contains 
declarations  that  define  the  graphics  library  interface.  The  " clockh "  header,  which  we'll 
see  below,  will  contain  definitions  that  customize  the  look  of  the  clock  face. 


maruti_mainO 

{ 

int  i,  xl,  yl,  x2,  y2,  color; 
char  mim_str[4]; 
mtime  curtime; 
mdate  curdate; 


/*  initialize  screen  library,  paint  screen  black  */ 


cons _graphics_initO; 

cons_fill_area(0, 0,  CONSOLE. WIDTH,  CONSOLE.HEIGHT,  BLACK); 


The  maruti.main  function  in  the  clock  application  draws  the  clock  face  display  and 
initializes  the  system  state  -  in  our  case,  the  positions  of  the  three  clock  hands.  Before 
drawing  on  the  screen,  the  application  must  call  cons_graphics_init,  and  initialize  the 
contents  of  the  screen.  The  call  to  cons_fill_area  does  this  by  filling  the  entire  screen  with 
the  color  BLACK 


/*  draw  tick  marks  for  clock  face  */ 


for(i  =  0;  i  <  60;  i++)  { 

polar_point(i*POS_PER_TICMARK,  OUTER.RADIUS,  &xl,  &yl); 
polar_point(i*POS_PER_TICMARK,  INNER.RADIU S,  &x2,  &y2); 

if(i  %  5)  color  =  GRAY; 
else  color  =  WHITE; 


cons_draw_line(xl,  yl,  x2,  y2,  color); 

} _ 

The  step  in  initialization  is  to  draw  the  tick  marks  for  the  clock  face.  There  will  be 
sixty  tick  lines  drawn  around  the  circle,  one  for  each  second.  Every  fifth  tick  mark  will  be 
WHITE  to  mark  the  hour  positions,  and  the  rest  will  be  GRAY.  The  lines  are  drawn  using 
the  cons_draw_line  library  routine,  which  draws  a  one-pixel-wide  line  between  two  points 
in  the  desired  color. 

The  location  of  the  endpoints  of  our  tick  marks  are  calculated  using  a  helper  routine, 
polar.point  (shown  below),  which  calculates  the  cartesian  coordinates  for  a  given  angle 
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and  radius.  We  conveniently  adopt  integer  angle  positions  starting  from  0  at  the  top, 
clockwise  around  up  to  60*POS_PER_TICMARK  back  at  the  top  again. 


/*  draw  numerals  for  clock  face  *1 

for(i  =  1;  i  <=  12;  i++)  { 
sprintf(num_str,  "%d”,i); 

polar_point(i*5 *POS_PER_HCMARK,  NUMBER.RADIUS,  &xl,  &yl); 
yl  -=  8;  xl  -=  strlen(num_str)*8  /  2;  /*  center  the  string  */ 
cons_print(xl,  yl,  num_str,  strlen(num_str),  YELLOW); 

} _ 

The  numerals  are  placed  on  the  clock  face  similarly  to  the  tick  marks.  The 
cons_print  graphics  library  function  places  text  on  the  screen  at  a  given  position  and  color. 


/*  initialize  the  hand  positions  to  current  time  */ 

maruti_get_current_time(&curtime) ; 
curdate  =  maruti_time_to_date(curtime); 

sec_pos  =  curdate.second  *  POS_PER_TICMARK; 
min_pos  =  curdate.minute  *  POS  JPERJTICMARK  + 
curdate.second  *  POS_PER_TICMARK  /  60  ; 
hour_pos  =  (curdate.hour  %  12)  *  5  *  POS_PER_TICMARK  + 
curdate.minute  *  5  *  POS_PER_TICMARK  /  60; 


return  0; 

} 


The  final  part  of  the  initialization  is  the  calculation  of  the  initial  placement  of  the 
clock  hands.  The  maruti_get_current_timelsystem  call  returns  the  current  system  time, 
given  as  a  mtime  structure.  The  system  time  is  kept  just  as  in  Unix — as  the  number  of 
seconds  and  microseconds  since  the  Epoch  time,  defined  as  00:00  GMT  on  January  1 , 
1970.  The  maruti_time_to_date  library  routine  does  the  job  of  calculating  the  date  and 
time-of-day  from  an  mtime  value. 


entry  sec_hand0 

{ 

static  int  erase  =  0; 


if(erase)  xor_ray(sec_pos,  WHITE); 
else  erase  =  1; 
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sec_pos  =  (sec_pos  +  POS_PER_TICMARK)  %  NUM_POSmONS; 
xor_ray(sec_pos,  WHITE); 


check_for_qnitkey  () ; 

} 


The  periodic  function  sec_hand  will  be  run  once  per  second.  It  erases  the 
previously  placed  second-hand  ray,  calculates  the  new  position  and  draws  again  there.  The 
check_for_quitkey  subroutine  (shown  below)  will  poll  the  keyboard  and  exit  the 
application  if  a  key  is  pressed. 


entry  min_hand() 

{ 

static  int  erase  =  0; 

if  (erase)  xor_triangle(min_pos,  MIN_RADIUS,  MIN_COLOR); 
else  erase  =  1; 

min_pos  =  (min_pos  +  1)  %  NUM_POSITIONS; 
xor_triangle(min_pos,  MINJRADIUS,  MIN_COLOR); 

} 

entry  hour_hand() 

{ 

static  int  erase  =  0; 

if(erase)  xor_triangle(hour_pos,  HOUR_RADIUS,HOUR_COLOR); 
else  erase  =  1; 

hour_pos  =  (hour_pos  +  1)  %  NUM_POSITIONS; 
xor_triangle(hour_pos,  HOUR_RADIUS,  HOUR_COLOR); 

} 


The  minjmnd  and  hour_hand  periodic  functions  update  their  respective  hand  positions  by 
one  each  time  they  are  called.  The  second  hand  jumps  forward  one  second  each  time  it  is 
called,  but  the  minute  and  hour  hands  creep  forward  in  smaller  relative  increments  (rather 
than  jumping  forward  once  per  minute  or  hour,  which  would  not  look  right). 


void  polar_point(int  pos,  int  radius,  int  *x,  int  *y) 

{ 

double  angle  =  (2.0*M  J>I/NUM_POSmONS)  *  (NUM_POSITIONS-pos)  + 

M_PI/2; 
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*x  =  CENTER_X  +  cos(angle)  *  radius; 
*y  =  CENTER_Y  -  sin(angle)  *  radius; 

} 


Finally  we  come  to  the  helper  functions.  The  polar_point  function  converts  from 
our  convenient  “positions"  to  real  angles  in  radians,  taking  into  account  that  radians  start  at 
the  right  and  run  counter-clockwise,  whereas  our  positions  start  at  the  top  and  run 
clockwise.  Given  an  angle  in  radians  and  a  radius  from  the  center,  the  x  and  y  coordinates 
of  the  point  are  found  by  taking  the  cosine  and  sine  of  the  angle.  The  final  twist  is  that  in 
cartesian  coordinates,  the  y  axis  points  up,  whereas  in  screen  coordinates  it  traditionally 
points  down,  so  the  y  coordinate  must  be  flipped  around. 


void  xor_ray(int  pos,  int  color) 

{ 

int  x,  y; 

polar_point(pos,  SEC_RADIUS,  &x,  &y); 
cons_xor_line(CENTER_X,  CENTERJY,  x,  y,  color); 

} 

void  xor_triangle(int  pos,  int  apex_radius,  int  color) 

{ 

int  xbl,  ybl,  xb2,  yb2,  xp,  yp; 
int  bp  1,  bp2; 

bpl  =  (pos  +  TRIANGLEJBASEL/2)  %  NUMPOSITIONS; 
bp2  =  (pos  -  TRIANGLE_BASEL/2)  %  NUM_POSmONS; 

polar_point(bpl,  TRIANGLEJBASER,  &xbl,  &ybl); 
polar_point(bp2,  TRIANGLE_BASER,  &xb2,  &yb2); 
polar_point(pos,  apexjradius,  &xp,  &yp); 


cons_xor_line(xbl,  ybl,  xb2,  yb2,  color);  /*  base  of  triangle  */ 
cons_xor_line(xbl,  ybl,  xp,  yp,  color);  /*  first  arm  */ 
cons_xor_line(xb2,  yb2,  xp,  yp,  color);  /*  second  arm  */ 

} _ 

These  graphic  helper  routines  draw  the  line  for  the  second  hand  and  the  triangle  for 
the  minute  and  hour  hands.  The  cons_xor_line  routine  is  similar  to  cons_draw_Iine,  but 
exclusive-or's  its  pixels  with  the  screen  rather  than  just  painting  them.  The  xor  technique  is 
often  used  in  graphics  programming  because  it  allows  the  drawing  and  erasing  of  objects 
without  disturbing  the  background.  When  multiple  objects  overlap,  the  overlapping 
portions  may  become  a  strange  color  due  to  xor'ing,  but  you  are  guaranteed  that  when  die 
objects  are  erased  by  xor'ing  them  a  second  time  in  the  same  location,  whatever  color  was 
there  before  will  be  restored. 
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void  check_for_quitkey(void) 

{ 

console_event_t  ev; 


if(cons_poll_event(&ev)  !=  0  &&  ev.device  =  E VENT_KE YB O ARD 
&&  ev.keycode  —  KEY_SPACE) 

quitO; 

} 


The  final  helper  routine  polls  the  console  keyboard  for  events,  and  quits  the 
application  if  the  space  bar  is  pressed.  The  cons_poll_eventlsystem  call  reports  both  key 
press  and  key  release  events,  and  reports  a  scan-code  rather  than  an  ASCII  value.  This 
interface  is  rather  low  level,  but  allows  the  application  complete  access  to  the  up/down  state 
of  every  key  on  the  keyboard. 

This  completes  the  clock.mpl  module.  The  clocLcfg  config  file  follows: 


#include  "clock.h" 
application  clock  { 

job  secjob;  init  secjob:  period  SEC_PERIOD  s;  /*  jobs  */ 
job  minjob;  init  minjob:  period  MIN_PERIOD  s; 
job  hourjob;  init  hourjob:  period  HOUR_PERIOD  s; 

task  ct;  start  ct:  clock;  /*  task  */ 

o  ct.sec_hand  o  in  secjob;  /*  threads*/ 

o  ctminjiand  o  in  minjob; 
o  ct.hour_hand  o  in  hourjob; 

} 


Notice  that  the  config  file  can  include  header  files  just  like  the  MPL  module  can. 
This  allows  the  programmer  to  put  configuration-related  constants  in  one  header  and  use 
them  in  both  the  config  file  and  the  application  modules. 

The  clock  config  simply  creates  one  task,  plus  a  job  for  each  hand  of  the  clock. 
The  periods  are  defined  in  “clock.h”: 


#define  INNER JIADIUS  235 

#define  OUTER.RADIUS  (INNER_RADIUS+15) 

#define  NUMBER JRADIU S  (OUTER_RADIUS+15) 

#define  TRIANGLE_B  ASER  30 
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#defme  TRIANGLE_BASEL  50 


#define  SEC_RADIUS  INNER_RADIUS 

#define  MIN_COLOR  YELLOW 
#define  MIN_RADIUS  (SEC_RADIUS-50) 

#defme  HOUR_COLOR  GREEN 
#define  HOUR_RADIUS  (MIN_RADIUS-50) 

#defme  NUM_POSITIONS  240 

#define  POS_PER_TICMARK  (NUM_POSITIONS/60) 

#defme  SECLPERIOD  1  /*  jumps  1  tickmark/sec  */ 

#define  MIN_PERIOD  (60/POS_PER_TICMARK)  /*  creeps  1  tickmark/min  */ 
#define  HOUR_PERIOD  (3600/5/POS_PER_TICMARK)  /*  creeps  5  tickmarks/hr  */ 


First,  a  number  of  constants  describing  the  visual  appearance  of  the  clock  face  are 
defined.  These  can  be  modified  to  taste. 

Second,  the  timing  characteristics  of  the  program  are  given.  The  key  parameter  is 
NUM_POSmONS,  which  gives  the  number  of  positions  which  the  minute  and  hour 
hands  take  around  the  clock  face.  The  larger  this  number,  the  smaller  the  distance  the 
hands  move  each  time,  and  the  more  frequently  their  jobs  are  executed.  The  minute  hand 
must  move  through  all  60  tick  marks  once  every  hour,  and  the  hour  hand  5  tick  marks  each 
hour.  With  NUM_POSITIONS  set  to  240,  each  hand  moves  four  times  for  each  tick  mark 
on  the  face  of  the  clock,  which  works  out  to  one  move  every  15  seconds  for  the  minute 
hand,  and  one  move  every  180  seconds  for  the  hour  hand. 
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Chapter  3 

MPL  /  C  References 

Maniti  Programming  Language  (MPL)  is  a  simple  extension  to  ANSI  C  to  support 
modules,  synchronization  and  communications  primitives,  and  shared  memory  variables. 
MPL  adds  some  restrictions  that  enable  analysis  of  the  CPU  and  memory  requirements  of 
the  program.  This  chapter  will  define  the  MPL-specific  features  that  differ  from  ANSI  C. 

3.1.  EBNF  Syntax  Notation 

In  this  manual,  syntax  is  given  in  Extended  Backus-Naur  Formalism  (EBNF).  In  this 
notation: 


•  literal  strings  are  quoted,  e.g.  'module'. 

•  other  terminal  symbols  are  bracketed,  e.g.  <module-name>. 

•  XIY  denotes  alternatives. 

•  {Z}  denotes  zero-or-more. 

•  {X}  denotes  zero-or-one. 

32.  MPL  Modules 

The  module  is  the  compilation  unit  in  MPL.  It  is  presented  to  the  MPL  compiler  as  one 
file,  but  may  contain  normal  C  #include  directives  so  that  the  parts  of  the  module  can  be 
kept  as  distinct  files.  The  MPL  compiler  generates  a  binary  object  file  for  the  module,  as 
well  as  a  partial  EU  graph  file  for  the  module,  which  contains  information  about  the 
module  needed  by  the  Maruti  analysis  tools. 

At  runtime,  each  MPL  module  is  mapped  to  a  Maruti  task,  which  logically  runs  in 
its  own  address  space.  Communication  between  tasks  is  through  channels  or  shared 
blocks.  Each  task  can  contain  multiple  threads  of  execution,  each  thread  corresponding  to 
an  entry  or  service  function  of  MPL. 

Each  module  starts  with  the  module  name  declaration: 


module_name_spec  ::=  'module'  <module-name>. 

33.  Module  Initialization 

When  the  task  corresponding  to  a  module  is  loaded,  the  Maruti  runtime  system  executes  a 
non-real-time  initializer  function  provided  by  the  programmer.  The  initializer  is  a  normal  C 
function,  but  it  must  be  present  in  every  module.  It  is  declared  as: 

int  maruti_main(int  argc,  char  **argv); 
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The  job  of  this  function  is  to  initialize  the  state  of  the  task,  taking  any  parameter 
values  into  account.  If  the  initializer  returns  0,  then  the  task  is  considered  successfully 
loaded,  otherwise  the  load  fails.  The  initializer  thread  can  not  send  or  receive  messages  on 
Maruti  channels. 

3.4.  Entry  Functions 

Maruti  entry  functions  occur  as  top-level  definitions  in  the  MPL  source  file,  similar  in 
syntax  to  normal  C  function  definitions. 


entry-function  ::=  'entry'  <entry-name>  (' ')'  entry-function-body, 
entry-function-body ::  =  channel-declaration-list  c-function-body. 

Entry  functions  serve  as  the  top-level  function  of  a  Maruti  thread  which  is  invoked 
repeatedly  with  a  period  as  specified  externally,  in  the  MCL  configuration.  Multiple 
instances  of  the  entry  thread  can  be  active  in  a  single  task  at  runtime,  so  care  must  be  taken 
to  protect  accesses  to  shared  data  with  a  region  or  local_region  construct. 

35.  Service  Functions 


Maruti  service  functions  also  occur  as  top-level  definitions  in  the  MPL  source  file. 


service-function  ::=  'service'  <service-name> 

,('<in-channel-name>':'<type_specifier>','<msg-ptr-name>')’ 

service-function-body. 

service-function-body ::  =  channel-declaration-list  c-function-body. 

Services  are  declared  with  the  initiating  channel  and  pointer  to  a  message  buffer.  A 
service  thread  is  invoked  whenever  a  message  on  the  channel  has  been  received,  thus  it 
inherits  the  scheduling  characteristics  of  the  sender  to  the  channel.  Multiple  instances  of  the 
service  may  be  active  in  a  single  task  at  the  same  time,  servicing  messages  from  different 
senders,  so  care  must  be  taken  to  protect  accesses  to  shared  data  with  a  region  or 
localjregion  construct. 

The  receipt  of  the  invoking  message  into  private  storage  is  automatic,  and  the 
service  function  is  called  with  a  pointer  to  the  message  buffer.  Fore  example,  given  the 
service  declaration: 


service  consumer(inch:  ch_type,  msg)  {  ...  } 


The  service  is  actually  invoked  as  if  it  were  a  C  function  declared: 


void  consumer(ch_type  *msg)  {  ...  } 
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3.6.  MPL  Channels 


In  Maruti,  channels  are  one-way,  typed,  communications  paths  whose  traffic  patterns  are 
analyzed  and  scheduled  by  the  system.  The  channel  end-points  are  declared  as  part  of  the 
entry  or  service  functions  which  take  part  in  the  communication.  The  endpoints  are 
connected  in  the  MCL  configuration  for  the  application. 


The  syntax  of  MPL  channels  is  similar  to  a  C  variable  declaration: 


channel-declaration-list  ::=  [  channel-decl  {  channel-decl }]. 
channel-declaration  ::=  channel-type  channel  {  channel } 

channel-type  ::=  'out'  I  'in'  I  'in_first'  I  'in_last'. 
channel  ::=  <channel-name> ':'  type-specifier. 

A  channel  endpoint  declaration  will  normally  be  either  an  out  endpoint  or  an  in 
endpoint,  used  in  the  sending  thread  and  receiving  thread,  respectively.  There  are  two 
special  variants  of  in  endpoint,  in_first  and  in_last,  which  denote  asynchronous  channels  in 
which  the  communications  will  not  be  scheduled,  and  the  input  buffers  are  allowed  to 
overflow.  For  in_first  channels,  the  first  messages  received  will  be  retained  and  the  rest 
dropped,  for  in_last  the  most  recent  messages  will  be  retained  and  older  messages 
overwritten. 

3.7.  Communication  Primitives 

The  message  passing  primitives  appear  as  normal  C  function  calls,  but  they  are  built  in 
primitives  of  die  MPL  compiler,  and  their  use  is  recorded  so  that  the  communications  on 
the  channel  can  be  analyzed. 

The  three  primitives  each  take  a  channel  name  and  a  pointer  to  a  message  buffer. 
Their  declarations  would  look  something  like  this: 

void  send  (out  ch_name,  ch_type*  message_ptr); 
void  receive  (in  ch_name,  ch_type*  message_ptr); 
int  optreceive(in  ch_name,  ch_type*  message_ptr); 

There  are  two  variants  of  the  receive  primitive.  A  noimal  receive  is  used  in  most  cases,  and 
it  raises  an  exception  if  there  is  no  message  delivered  at  the  time  it  is  executed.  Normally 
the  Maruti  scheduler  will  arrange  things  so  that  this  is  never  happens.  When  messages 
might  not  be  present  when  the  receiver  is  run,  as  when  threads  are  communicating 
asynchronously  with  in_first  and  in_last  channels,  or  when  the  sender  sometimes  will  not 
send  the  message  due  to  run  time  conditions,  an  optreceive  must  be  used.  The  optreceive 
variant  checks  if  a  message  is  present,  and  receives  it  if  so.  It  returns  1  if  a  message  was 
delivered,  or  0  if  no  message  was  delivered. 
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3.8.  Critical  Regions 

Mutual  exclusion  is  often  necessary  to  prevent  the  corruption  of  data  structures  modified 
and  accessed  by  concurrent  threads.  In  Maruti,  the  region  statement  delineates  a  critical 
region. 

region-statement  ::=  ('region'llocal_region’)  <region-name>  c-statement. 

The  local  region  variant  is  used  within  a  task,  usually  to  serialize  multiple  thread  access  to 
data  structures.  The  region  variant  is  global  to  the  application,  and  is  used  to  serialize 
access  to  shared  buffers  and  other  application-defined  resources,  as  specified  in  the  MCL 
configuration  for  the  application. 

3.9.  Shared  Buffers 

Finally,  MPL  adds  shared  buffers  to  the  C  language.  Shared  buffers  declarations  are 
similar  in  syntax  to  typedef  declarations: 

shared-buffer-decl  ::=  'shared'  <type-specifier>  <shared-buffer-name>. 

The  shared  buffer  declaration  is  effectively  a  pointer  declaration.  For 
example: 


shared  some_type  shared_buffer; 


is  treated  as  if  it  were  a  declaration  of  the  form: 


some_type  *shared_buffer  =  &some_buffer; 

The  MPL  specification  for  the  application  determines  which  tasks  share  each  shared 
memory  area.  The  runtime  system  takes  care  of  allocating  memory  for  the  shared  buffers, 
and  initializing  the  buffer  pointers.  The  MPL  program  can  at  all  times  dereference  the 
pointer. 

3.10.  Restrictions  to  ANSI  C  in  MPL 

The  Maruti  real-time  scheduling  methodology  requires  that  the  tools  be  able  to  analyze  the 
control  flow  and  stack  usage  of  the  MPL  programs,  and  that  synchronization  points  be  well 
known.  Thus  the  following  restrictions  to  ANSI  C  must  be  followed  by  the  MPL 
programmer 

•  No  receive  primitives  are  allowed  within  either  loops  or  conditionals. 

•  No  region  construct  are  allowed  within  either  loops  or  conditionals. 

•  No  send  primitive  within  a  loop. 

•  Direct  or  indirect  recursion  is  not  allowed. 

•  Function  calls  via  function  pointers  should  not  be  used. 
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Chapter  4 

MCL  Reference 


Maruti  Configuration  Language  (MCL)  is  used  to  specify  how  individual  program  modules 
are  to  be  connected  together  to  form  an  application  and  to  specify  the  details  of  the 
hardware  platform  on  which  the  application  is  to  be  executed. 

MCL  is  an  interpreted  C-like  language.  The  MCL  processor  is  called  the 
integrator.  The  integrator  interprets  the  instructions  of  the  MCL  program,  instantiating  and 
connecting  the  components  of  the  application,  checking  for  type  correctness  as  it  goes,  and 
outputs  the  application  graph  and  all  allocation  and  scheduling  constraints  for  further 
processing  by  other  Maruti  tools. 

4.1.  Top-level  Declarations 

Like  a  C  program,  an  MCL  configuration  file  is  composed  of  a  number  of  top-level 
declarations.  The  C  preprocessor  is  invoked  first,  so  the  configuration  file  may  contain 
#include  and  #define  directives  to  make  the  configuration  very  customizable. 

configuration  ::=  {toplevel-declaration}. 
toplevel-declaration  ::=  variable-declaration  I  system-declaration 
block-declaration  I  application-declaration. 

The  declarations  may  occur  in  any  order — they  do  not  have  to  be  defined  before  used.  The 
four  types  of  top  level  declaration  are  described  in  more  detail  below. 

4.1.1.  The  Application  Declaration 

application-declaration  ::=  'application'  <application-name> 

'{'  {instruction} '}'. 

Like  the  main  function  of  C,  the  application  declaration  is  where  the  integrator  will  begin 
execution  of  the  configuration  directives.  Only  one  application  may  be  declared  in  the 
configuration. 

4.1.2.  The  System  Declaration 

system-declaration  ::=  'system'  <system-name> 

'{'  {node-declaration} '}'. 

node-declaration  ::=  'node'  <variable-name>  ['with'  attributes], 
attributes  :  :=  attribute  { ’,'  attribute } . 

attribute  ::=  <symbol>  ['='  <integer>  I  '='  <symbol>  I  '=’  <string>]. 


Like  the  application  declaration,  the  system  declaration  can  occur  at  most  once  in  a 
configuration.  It  is  not  needed  for  single-node  operation.  The  system  declaration  names 
the  nodes  that  an  application  will  run  on,  and  specifies  attributes  for  them.  For  example: 
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system  hdw  { 

node  northstar  with  address  = "  { 0x00, 0x60, 0x8c,0xb  1 ,0xfb,0xc6 } " ,  master; 
node  raduga  with  address  = "  { 0x00, 0x60,0x8c, Oxb  1 ,0xf6,0x67 } " ; 

} 


The  integrator  does  not  assign  any  meaning  to  the  attributes  declared  for  the  nodes, 
it  just  passes  die  information  along.  However,  the  Maruti  binder  does  require  the 
addresslattribute  for  each  node,  which  specifies  the  node’s  ethemet  address,  and  the  master 
attribute  on  only  one  node,  to  specify  which  node  will  be  the  boot  and  time  master.  The 
Marati/Virtual  environment  further  requires  that  the  node  <variable-name>  correspond  to 
the  hostname  of  the  node  in  the  testbed  environment. 

4.1.3.  Block  Declarations 

block-declaration  ::=  'block'  <block-name>  '('  [block-parameters] ')' 
block-parameter-channels 
'{'  {instruction}  '}'. 

A  block  is  something  like  a  function  in  C.  When  a  block  is  declared,  it  may  be 
called  by  any  other  block,  except  that  no  self-recursion  is  allowed.  A  block  can  not  be 
declared  inside  another  block.  A  block  is  called  by  giving  its  name  and  parameters.  There 
are  2  kinds  of  parameters:  classical  parameters  and  channel  parameters. 

block-parameters  ::=  parameter  {  ','  parameter  }. 
parameter  ['var']  <parameter-name>  ['[]']  type]. 

Classical  parameters  are  like  function  parameters  in  C  or  Pascal.  They  can  be 
passed  by  value  or  by  variable  (var  for  variable  passing).  Arrays  may  also  be  be  given  as 
var  parameters.  The  type  of  the  parameter  must  be  given  for  the  first  parameter.  It  may  be 
omitted  for  following  parameters:  the  integrator  will  assume  that  the  parameter  with  no 
given  type  has  the  same  type  as  the  previous  parameter. 

block-parameter-channels  ::=  {  ('inTout')  channel-names  ';'  }. 
channel-names  ::=  channel  {  ','  channel }. 
channel  ::=  <channel-name>  [ '['  <integer>  ']'  ]. 

The  channel  parameters  decribe  the  inputs  and  outputs  of  the  block.  The  in  and  out 
keywords  do  not  have  exactly  the  same  meaning  has  in  MPL:  they  only  show  which 
channels  are  connected  at  the  left  and  which  are  connected  at  the  right  of  the  block  call  (see 
connection  below).  The  communication  type  of  the  channel  in_first,  in_last,  or 
synchronous)  and  the  type  of  the  messages  on  the  channel  are  determined  by  the 
connections  of  the  channels  to  the  tasks. 

When  there  is  an  array  of  channel  parameters,  the  connections  will  occur  in 
ascending  order.  For  example: 

block  foo() 
inch[3]; 

{  } 
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application  bar  { 
channel  i[3]* 

<a[0..2]>foo()  o;  /*  a[0]->ch[0],  a[l]->ch[2],  a[2]->ch[2]  */ 

} 


4.1.4.  Variable  Declarations 

variable-declaration  ::=  type  variable-names 

type  ::=  'float'  I  'int'  I  'string'  I  'time'  I  'channel' 

I  'task'  I  'job'  I  'node'  I  'shared'  I  'region'. 

variable-names  ::=  variable  {  ','  variable  }. 

variable  ::=  <variable-name>  ['['  <integer>  *]']  ['['  <integer>  ']']. 

Variables  may  be  declared  globally  at  the  top-level,  or  locally  in  a  block.  Global 
variables  can  be  accessed  in  all  blocks,  while  local  variables  can  only  be  accessed  in  the 
block  where  they  are  declared.  A  local  variable  (or  a  parameter)  may  be  declared  with  the 
same  name  as  a  global  variable.  In  this  case  only  the  local  variable  (or  the  parameter)  can 
be  accessed  in  the  block. 

The  order  of  the  variable  declarations  does  not  matter.  For  example: 


block  foo() 

{ 

i  =  4s  +  5mn;  /*  correct  */ 
timei; 

} 


Arrays  may  be  declared.  As  in  C,  the  array  indides  are  numbered  from  0  to  size-of- 
array  less  1.  Arrays  of  1  or  2  dimensions  are  accepted.  For  example: 


-  block  foo() 

{ 

string  s[10]; 

s[5]  =  "a  string";  /*  correct  *1 

s[0]  =  s[5]  + "  foo";  /*  correct  *1 

s[10]  = /*  incorrect:  out  of  array  limits  */ 

} 


42.  Instructions 

The  MCL  integrator  interprets  a  number  of  instructions  that  express  the  way  an  application 
is  to  be  built  up  from  components.  The  different  instructions  are  explained  below. 

instruction  ::=  variable-declaration 
I  task-initialization 
I  job-initialization 
I  connect-declaration 
I  link-intruction 
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I  allocation-instruction 
I  expression 
I  print-instruction 
I  compound-instruction 
I '{'  {instruction} 

4.2.1.  Compound  Instructions 

compound-instruction  ::= 

'if  '('  test-expression  ')'  induction 
I  'if  '('  test-expression  ')'  induction  'else'  instruction 
I  'do'  instruction  'while' '('  test-expression  ')' ';' 

I  'while' '('  test-expression  ')'  instruction 
I  'for' '('  expression  ';'  test-expression  expression  ')' 
instruction. 

test-expression  ::=  expression. 

The  meaning  of  these  constructs  is  the  same  as  in  the  C  language.  The  test- 
expression  should  evaluate  to  an  integer,  where  0  means  false,  and  all  other  values  mean 
true. 

4.2.2.  Tasks 


task-initialization  ::=  'start'  names  ':'  <module-name>  [module-parms] 

[instantiation]  [task-allocation] 

module-parms  ::=  '('  [module-parameter-list]  ’)’. 
module-parameter-list  expression  {','  expression}. 

instantiation  ::=  'with'  <symbol>  '='  constant 
{','  <symbol>  '='  constant }. 

task-allocation  ::=  'on'  expression. 

A  variable  of  type  task  must  be  initialized  before  it  can  be  used.  This  initialization  consists 
of  giving  the  name  of  a  module:  the  task  will  be  an  instantiation  of  this  module.  Module 
parameters  may  be  given:  after  evaluation  they  will  be  given  to  the  initia1i7P.r  thread  of  the 
module.  The  initializations  during  the  loading  of  an  application  will  take  place  in  exactly 
the  same  order  as  thay  are  found  by  the  intergrator  during  the  execution  of  the 
configuration. 

All  the  shared  buffers  and  the  global  regions  of  the  module  must  be  instantiated 
using  the  with  clause:  the  corresponding  shared  or  region  variables  must  be  given. 

The  on  clause  may  be  used  to  force  allocation  of  the  task  on  a  particular  node. 

4.2.3.  Job  Initialization 

job-initialization  ::=  'init'  names ':'  timing-job 
timing-job  ::=  {  'period'  expression  }. 

A  variable  of  type  job  must  be  initialized  before  it  can  be  used.  The  job  will  refer  to  a 
collection  of  threads  with  the  same  period. 
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4.2.4.  Connections 

connect-declaration  ::=  chan-list  connect-name  chan-list 
[in-job]  {timing-service}  [task-alloc]';'. 
chan-list  ::=  '<'  [names] 

connect-name  ::=  <task-name>  ['['expression']']  V  <routine-name> 

I  <block-name>  ’('  [expression  expression}] ')'. 
in-job  ::=  'in'  constant. 

timing-service  ::=  ( 'ready'  expression  I  'deadline'  expression ). 

There  are  two  types  of  connections:  routine  connections  and  block  connections.  In  both 
cases  the  inputs  are  connected  (or  mapped)  to  the  channels  declared  at  the  left  of  the 
connection  and  the  outputs  at  the  right.  The  number  of  input  (or  ouput)  channels  must  be 
the  same  as  in  the  definition  of  the  routine  (or  the  block).  The  mapping  is  done  following 
the  order  of  this  definition. 

In  a  routine  connection  the  inputs  and  ouputs  of  an  entry  or  a  service  of  a  task  are 
connected  to  channels.  This  connection  creates  a  new  instance  of  a  service  if  the  routine 
was  a  service,  otherwise  it  creates  the  only  instance  of  an  entry.  An  entry  can  not  be 
connected  many  times. 

For  an  entry  connection  a  job  name  must  be  given,  the  entry  will  be  a  part  of  this 
job.  For  a  service,  a  job  can  not  be  declared:  the  job  of  the  service  is  implicitly  given  by  the 
connection:  the  first  input  channel  of  a  service  is  the  triggering  channel  of  the  service.  The 
job  of  the  service  is  the  same  as  the  job  of  the  origine  of  die  triggering  channel. 

A  timing  characterization  may  only  be  given  to  a  routine  connection. 

In  a  block  connection  the  input  and  output  channels  of  the  block  are  mapped  to  the 
given  channels.  A  mapping  is  also  done  for  all  the  block  parameters,  following  die  order  in 
die  block  definition.  The  number  of  parameters  must  be  the  same  as  in  this  definition,  and 
all  the  types  must  be  coherent 

4.2.5.  Allocation  Instructions 

allocation-instruction  ::=  'separate' '('  names  ')' 

I  'together' '('  names  ')' 

A  separate  instruction  is  a  command  to  the  allocator  to  keep  the  tasks  on  different  nodes  in 
the  final  system.  A  together  instruction  specifies  that  all  tasks  must  be  allocated  to  the 
samenode. 


4.2.6.  Link  Instruction 

link-intruction  ::=  link'  expression  'to'  expression 

In  a  few  cases  the  connections  are  not  sufficient  to  describe  a  communication  graph  with 
the  structure  of  the  blocks.  In  these  cases  a  link  instruction  may  be  used. 

A  link  between  two  channels  means  that  the  two  channels  are  the  same. 

Example:  if  we  want  to  connect  directly  an  input  and  an  output  channel  of  a  block  a 
link  must  be  used. 
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block  fooO 
in  in_channel; 
out  out_channel; 

{ 

link  in_channel  to  out_channel; 

} 


4.2.7.  Print  Instruction 

print-instruction  ::=  'print  ('  expression  expression} 

The  print  instruction  outputs  messages  to  the  standard  output  during  integration.  This 
instruction  can  be  used  for  the  debugging  of  a  configuration  file.  Any  string,  number,  or 
time  may  be  printed.  A  newline  is  added  at  the  end. 


43.  Expressions 

Expressions  in  MCL  are  very  similar  to  C  expressions: 


expression  ::=  expression  '='  expression 
I  expression  'll'  expression 
I  expression  '&&'  expression 
I  expression  '='  expression 
I  expression  '!='  expression 
I  expression  '<’  expression 
I  expression  ’>'  expression 
I  expression  '<='  expression 
I  expression  ’>='  expression 
I  expression  ('dThTmnTsTmsTus') 

I  expression  '+'  expression 
I  expression '-'  expression 
I  expression  expression 
I  expression  7'  expression 
I  expression  '%'  expression 
I  '!'  expression 
I  '('  expression  ')' 

I  expression  '++' 

I  expression  '-' 

I  constant 


tant  ::=  <symbol>  [ '['  expression  ']'  3  [  T  expression  ']'  ] 

I  <symbol>  '['  expression  expression  ']'  [  '['  expression  ']'  ] 
I  <symbol>  '['  expression  ']'  '['  expression  expression  ']' 

I  <integer> 

I  <double> 

I  <string>. 
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In  addition  to  the  usual  C  expressions,  MCL  supports  time  unit  expressions,  for  example, 
‘3  s  +  500  ms’  is  a  time  expression  that  evaluates  to  3.5  seconds. 

Also,  MCL  supports  array  range  notation  as  a  shorthand  for  lists.  For  example,  the 
expression  ‘c[2..4]’  is  shorthand  for  c[2],  c[3],  c[4]’.  This  notation  is  most  often  used  for 
passing  arrays  of  channel  values  to  blocks  or  in  connection  instructions. 
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CHAPTER  5 

Maruti  Runtime  System  Reference 

The  Maruti  runtime  system  is  bound  together  with  the  application  binary  files  by  the  mbind 
utility.  Only  those  parts  of  the  runtime  needed  by  the  application  are  linked  in.  There  are 
several  versions  of  the  runtime  system  available  depending  on  the  environment  in  which  the 
application  will  be  run.  For  example,  there  are  two  different  versions  of  the  core  library:  a 
stand-alone  version  that  can  boot  directly  on  bare  hardware,  and  a  Unix  version  that  runs  as 
a  user-level  process  under  Unix,  providing  virtual-time  execution  and  access  to  debugging 
tools. 

The  set  of  library  versions  that  an  application  links  with  are  called  flavors.  Flavors 
are  specified  by  the  programmer  as  strings  of  library  names  separated  by  a  V,  for 
example,  ‘ux+xll’. 

5.1.  Core  Library  Reference 


#include  <maruti/maruti-core.h> 

The  Maruti  core  library  implements  the  scheduling,  thread  and  memory  management,  and 
network  communication  subsystem.  It  provides  primitives  for  applications  to  send  and 
receive  messages,  insert  preemption  points,  manipulate  the  schedule  (via  calendars),  and  do 
time  and  date  calculations.  There  are  currently  two  flavors  of  the  core  library: 

•  sa  --  The  Maruti/Standalone  core  library.  Applications  linked  with  this  flavor  can  be 
booted  directly  (by  the  NetBSD  boot  blocks).  It  includes  the  distributed  operation 
support,  based  on  the  3Com  3c507  Etherlink/16  adapter. 

•  ux  —  The  Maruti/Virtual  debugging  core  library.  Applications  linked  with  this  flavor  are 
run  as  normal  Unix  processes  from  the  NetBSD  command  line.  It  includes  a  virtual¬ 
time  scheduler  and  debugging  monitor  (described  below)  and  implements  distributed 
operation  using  normal  Unix  TCP/IP  networking  facilities. 


5.1.1.  MPL  Built-in  Primitives 

void  maruti_eu(void) 

The  maruti_eu  primitive  inserts  a  Maruti  EU  break  into  the  program  at  the  location  of  the 
call.  It  is  not  normally  used  explicitly  in  an  application,  as  the  system  tools  put  EU  breaks 
where  necessary  for  synchronization.  It  is  useful,  however,  for  breaking  up  long-running 
EUs  -  the  maruti_eu  then  serves  as  a  possible  preemption  point 


void  send(out  ch_name,  chjype*  message_ptr) 
void  receive(in  ch_name,  chjype*  message_ptr) 
int  optreceive(in  ch_name,  chjype*  message_ptr) 
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The  communications  primitives  are  documented  in  section  3.7.  in  the  MCL  Reference 
Chapter. 

5.1.2.  Calendar  Switching 

int  maruti_calendar_activate(int  calendar_num,  mtime  switch_time,  mtime  offset_time) 
void  maruti_calendar_deactivate(int  calendar_num,  mtime  switch_time) 

-  Maruti  calendars  may  be  activated  and  deactivated  ( switched  on  or  off)  at  any  time. 
The  switch_time  is  the  time  at  which  the  de/activation  should  take  place.  The  switch  can 
occur  at  any  point  in  the  future,  and  the  switch  requests  can  come  out  of  order  with  respect 
to  the  switch  time.  Requests  with  the  same  switch  time  are  executed  in  the  order  of  the 
requests. 

Calendars  can  be  activated  with  a  particular  offset_time,  which  is  the  relative 
position  within  the  calendar  to  start  executing  at  the  switch  time.  The  offset  time  will 
normally  be  zero,  but  can  be  any  relative  time  up  to  the  1cm  time  of  the  calendar. 

The  runtime  system  does  not  check  the  feasibility  of  the  combined  schedules 
represented  by  the  calendars  -  that  should  be  done  offline. 

5.1.3.  Calendar  Modification 

void  maruti_calendar_create(int  calendar_num,  int  num_entries,  mtime  lcmjime)  void 
maruti_calendar_delete(int  calendar_num) 

Calendars  are  normally  created  offline  and  compiled  into  the  Maruti  application,  but 
it  is  possible  to  create  new  calendars  at  runtime.  The  application  is  responsible  for  insuring 
that  the  generated  schedules  are  feasible. 

When  a  calendar  is  created,  the  maximum  number  of  entries  it  will  contain  must  be 
specified,  as  well  as  the  lcm_time,  which  is  the  period  of  the  calendar  as  a  whole.  At  the 
end  of  its  period,  the  calendar  will  wrap  around  and  begin  executing  from  the  beginning 
again. 


typedef  struct  calendars  { 
entry_t  ^entries; 
int  num_entries; 
mtime  lcm_time; 
mtime  base_time; 

entry_t  *cur;  /*  cur-entries  is  the  current  offset  */ 

<...> 

}  calendarj; 
typedef  struct  { 
int  eu_thread,  eu_id; 
mtime  eu_start,  eu_deadline; 
int  eu_type; 

#  define  EU_EMPTY  0  /*  empty  EU  slot  */ 

#  define  EU_PERIODIC  1  /*  periodic  EU  */ 

<...> 

}  entry_t; 
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void  mamti_calendar_get_header(mt  calendar_num,  calendar_t  *calendaip) 
void  maruti_calendar _get_entry(int  calendar_num,  int  entry_mun,  entry_t  *entryp) 
void  maruti_calendar_set_entry(int  calendar_num,  int  entiy_num,  entry_t  entry) 

The  maruti_calendar_set_entry  call  is  used  to  populate  new  calendars.  It  can 
overwrite  any  entry  in  any  inactive  calendar.  The  entry  eu_start  and  eu_deadline  times  are 
the  earliest  start  time  and  latest  end  time,  respectively.  The  eu_id  serves  to  identify  the  eu 
when  tracing  or  reporting  timing  results. 

The  maruti_calendar_get_header  and  maruti_calendar_get_entry  calls  can  be  used  to 
query  the  contents  of  a  calendar.  These  are  useful  when  ‘cloning’  an  existing  calendar  into 
a  new  calendar,  perhaps  with  modifications. 

5.1.4.  Date  and  Time  Manipulation 

#include  <maruti/mtime.h> 

The  Maruti  core  library  provides  routines  and  macros  for  simple  time  and  date  calculations. 


typedef  struct  { 
long  seconds; 
long  microseconds; 
}  mtime; 


#define  time_cmp(a,b) 

/* 

#define  time_add(a,b) 

/* 

#define  time_sub(a,b) 

/* 

#define  time_add_scalar(t,  s) 

/* 

#define  time_sub_scalar(t,  s) 

/* 

#defme  time_mul_scalar(t,  s) 

/* 

#define  time_div_scalar(t,  s) 

/* 

like  strcmp,  0  if  eq,  It  0  if  a  <  b,  etc  */ 
a  +=  b  */ 
a  -=  b  */ 

t  +=  s  (s  is  an  int,  in  microseconds)  */ 
t  -=  s  (s  is  an  int,  in  microseconds)  */ 
t  *=  s  (s  is  an  int)  */ 
t  /=  s  (s  is  an  int)  */ 


The  mtime  type  is  the  basic  Maruti  time  structure.  A  number  of  convenience  macros  for 
arithmetic  on  mtime  values  are  provided.  Two  mtime  values  may  be  compared,  added,  or 
subtracted.  In  addition,  an  integer  time  in  microseconds  may  be  added  to  and  subtracted 
from  an  mtime  value,  and  mtime  values  may  be  multiplied  or  divided  by  integer  scaling 
factors. 

Note:  The  microseconds  field  is  always  in  the  range  0  to  999999,  and  the  time 
represented  by  an  mtime  value  is  always  the  number  of  seconds  plus  the  number  of 
microseconds.  These  rules  hold  even  for  negative  mtime  values,  which  can  arise  when 
subtracting  mtimes.  Thus  the  mtime  representation  for  the  time  $-1.3$  seconds  is  {  -2, 
700000  }. 


void  maruti_get_current_time(mtime  *curtime) 

The  current  system  time  is  returned  by  maruti_get_current_time.  Maruti,  like  Unix, 
represents  absolute  time  as  the  number  of  seconds  and  microseconds  since  the  Epoch  time, 
defined  as  00:00  GMT  on  January  1, 1970. 
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typedef  struct  { 
short  year; 
short  month; 
short  wday; 
short  mday; 
short  yday; 
short  second,  minute; 
short  hour; 
int  microsecond; 

}  mdate; 


/*  year  - 1900  *1 
/*  month  (0..11)  */ 

/*  day  of  week  (0..6)  */ 
/*  day  of  month  (1.31)  */ 
/*  day  of  year  (0..365)  */ 
/*  0..59  */ 

/*  0..23  */ 

/*  0..999999  */ 


mtime  maruti_date_to_time(mdate  d) 
mdate  maruti_time_to_date(mtime  t) 


mtime  maruti_gmtdate_to_time(mdate  d) 
mdate  maruti jime_to_gmtdate(mtime  t) 


int  maruti_set_gmtoff(int  gmtoff) 
int  maruti_get_gmtoff(int  *gmtoffp) 

Applications  will  often  want  to  view  the  time  as  something  more  convenient  than  the 
number  of  seconds  since  the  Epoch.  The  Maruti  mdate  type  denotes  a  time  expressed  as  a 
date  plus  a  time  of  day.  The  functions  maruti_time_to_gmtdate  and 
maruti__gmtdate  to  time  convert  between  mtime  and  mdate  values  using  the  GMT 
timezone.  The  functions  maruti_time_to_date  and  maruti_date_to_time  convert  using  the 
local  offset  from  GMT. 

The  local  timezone  used  in  these  conversions  is  initially  set  by  the  runtime  system, 
but  may  be  changed  by  the  application.  The  timezone  is  expressed  as  an  offset  from  GMT 
in  seconds.  For  example  the  U.S.  timezone  EST  is  5  hours  behind  GMT,  or  -18000 
seconds  offset. 

Note:  Maruti  does  not  at  this  time  attempt  to  handle  leap  seconds  or  automatically 
switching  the  local  timezone  to  account  for  daylight  savings  times.  The  cost  of  providing 
these  features  in  code  and  table  space  was  deemed  prohibitive. 

5.1.5.  Miscellaneous  Functions 

void  quit(int  exitcode) 

The  quit  call  terminates  the  application.  The  exit  code  is  not  usually  relevant  in  an 
embedded  system,  but  will  be  returned  to  the  environment  where  that  makes  sense  (such  as 
in  the  Unix  debugging  environment). 


372 


52.  Console  Library  Reference 


#include  <maruti/console.h> 

The  Maruti  console  graphics  library  provides  access  to  the  console  device, 
including  the  keyboard  and  speaker,  but  most  importantly  the  graphical  display.  The 
graphics  library  includes  support  for  placing  text  anywhere  on  the  screen,  simple  2d 
geometry  primitives  suitable  for  generating  line  and  bar  graphs,  and  includes  optimized 
routines  for  moving  bitmaps  without  flicker,  for  animated  simulations.  There  are  currently 
three  flavors  of  the  graphics  library  implemented: 


•  et4k  -  This  flavor  supports  Super  VGA  graphics  cards  based  on  the  Tseng  Labs 

ET4000  chip  and  its  accelerated  descendents,  like  the  ET4000/W32.  The  et4k 
flavor  runs  the  screen  at  a  resolution  of  1024x768  in  256  color  mode. 

•  vgal6  --  This  flavor  supports  all  standard  VGA  graphics  cards,  running  the  screen 

at  a  resolution  of  640x480,  in  16  color  banked  mode. 

•  xll  —  This  flavor  works  with  the  {\bf  ux}  core  flavor,  displaying  the  Maruti  screen 

in  an  XI 1  window  under  Unix. 

5.2.1.  Screen  Colors 

The  Maruti  console  graphics  library  supports  the  following  colors,  defined  in 
<maruti/console.h>: 


#defme  BLACK 

0 

#define  DARKJBLUE 

1 

#define  D  ARK_GREEN 

2 

#define  DARKCYAN 

3 

#define  DARKJRED 

4 

#define  DARKJVTOLET 

5 

#define  DARKYELLOW 

6 

#define  DARK_WHITE 

7 

#define  BROWN 

8 

#define  BLUE 

9 

#define  GREEN 

10 

#define  CYAN 

11 

#define  RED 

12 

#define  VIOLET 

13 

#define  YELLOW 

14 

#defme  WHITE 

/*  aliases  */ 

15 

#define  GREY 

DARK_WHTTE 

#define  GRAY 

DARK  WHITE 
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The  maximum  screen  size  supported  is  also  defined: 

#define  CONSOLE_WE)TH  1024 

#define  CONSOLE_HEIGHT  768 

5.2.2.  Graphics  Functions 


void  cons_graphics_init(void) 

The  cons_graphics_init  fimction  must  be  called  before  any  other  graphics  functions, 
usually  from  the  maruti_main  function  of  the  application  s  screen  driver  task. 

void  cons_fill_area(int  x,  int  y,  int  width,  int  height,  int  color) 
void  cons_xor_area(int  x,  int  y,  int  width,  int  height,  int  color) 

These  functions  paint  an  area  of  the  screen,  specified  by  its  upper-left  coordinates 
(x,  y),  and  its  width  and  height,  in  the  given  color.  The  cons_fill_area  variant  overwrites 
the  previous  contents  of  that  area  of  the  screen, while  cons_xor_area  exclusive-or  s  the 
screen  contents  with  the  specified  color. 


function  cons_draw_pixel(int  x,  int  y,  int  color) 
function  cons_xor_pixel(int  x,  int  y,  int  color) 

These  functions  draw  and  xor,  respectively,  a  single  pixel  at  (x,  y)  in  the  specified 

color. 

void  cons_draw_line(int  xl,  int  yl,  int  x2,  int  y2,  int  color) 
void  cons_xor_line(int  xl,  int  yl,  int  x2,  int  y2,  int  color) 

These  functions  draw  and  xor,  respectively,  a  single-pixel  width  line  from 
coordinates  (xl,  yl)  to  (x2,  y2)  in  the  specified  color. 


void  cons_draw_bitmap(int  x,  int  y,  int  width,  int  height, 
void  "“bitmap,  int  color) 

void  cons_xor_bitmap(int  x,  int  y,  int  width,  int  height, 
void  "“bitmap,  int  color) 

These  functions  draw  and  xor,  respectively,  a  width-by-height  sized  bitmap  onto 
the  screen  in  the  specified  color,  with  its  upper-left  comer  at  (x,  y).  The  bitmap  is  in 
standard  X  bitmap  format,  with  eight  pixels  per  byte,  and  an  even  multiple  of  eight  pixels 
per  scan  line. 


void  cons_move_bitmap(int  xl,  int  yl,  int  x2,  int  y2,  int  width,  int  height, 
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void  *bitmap,  int  color) 

void  cons_xor_move_bitmap(int  xl,  int  yl,  int  x2,  int  y2,  int  width,  int  height, 
void  *bitmap,  int  color) 

These  functions  optimize  the  erasing  and  redrawing  of  a  bitmap  by  combining  the 
operations  into  one  loop,  modifying  one  scan-line  at  a  time.  This  optimization  eliminates 
the  flicker  that  can  occur  when  erasing  the  entire  bitmap  then  redrawing  it,  making 
animations  more  effective. 

The  call  cons_move_bitmap(xl  ,y  1  ,x2,y2,w,h,b,c)  is  equivalent  to  the  sequence: 

cons_draw_bitmap(xl,x2,w,h,b, BLACK); 
cons_draw_bitmap(x2,y2,w,h,b,c); 

The  call  cons_xor_move_bitmap(xl,yl,x2,y2,w,h,b,c)  is  equivalent  to  the  sequence: 

cons_xor_bitmap(xl  ,y  1  ,w,h,b,c); 
cons_xor_bitmap(x2,y2,  w,h,b  ,c) ; 

void  cons_puts(int  x,  int  y,  int  color,  char  *string) 
void  cons_xor_puts(int  x,  int  y,  int  color,  char  *string) 

These  functions  draw  and  xor,  respectively,  a  text  string  at  (x,  y)  in  the  specified  color. 

5.2.3.  Keyboard  and  Speaker  functions 

typedef  struct 

{ 

unsigned  char  device;  /*  just  keyboard  works  for  now  */ 

#  define  EVENT_OTHER  0 

#  define  EVENTJKEYBOARD  2 
unsigned  char  keycode; 

}  console_event_t; 

int  cons_poll_event(console_event_t  *  event) 

The  cons_poll_event  call  returns  1  if  a  console  event  has  occurred,  0  otherwise. 
There  there  is  a  pending  console  event,  the  event  structure  is  filled  in.  The  device  field  is 
set  to  EVENT_KEYBOARD  and  the  keycode  field  is  set  to  the  scan  code  of  the  key  that 
was  pressed.  The  list  of  scan  codes  is  in  <maruti/keycodes.h>. 

void  cons_start_beep  (int  pitch) 
void  cons_stop_beep(void) 
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The  console  speaker  can  be  turned  on  and  off  with  these  functions.  The 
cons_start_beep  call  programs  the  speaker  to  sound  at  a  particular  frequency,  in  hertz, 
and  cons_stop_beep  turns  it 

off. 

53.  Maruti/Virtual  Monitor 

The  ux  flavor  of  the  Maruti  core  library  includes  some  basic  debugging  facilities  called  the 
Maruti  monitor.  While  an  application  compiled  with  ux  is  running,  aspects  of  its  execution 
can  be  controlled  from  the  Unix  tty  (which  will  be  distinct  from  the  console  keyboard 
device).  The  monitor  provides  the  following  facilities: 

Tracing  scheduler  actions.  The  user  can  independently  toggle  the  tracing  of  elemental  unit 
executions,  calendar  wrap-around  events,  and  calendar-switch  events. 

Single-stepping  calendars  or  elemental  units.  The  user  can  toggle  single  stepping 
through  each  elemental  unit  execution,  or  a  whole  calendar's  execution. 

Controlling  virtual-time  execution  speed.  The  user  can  control  the  speed  of  the 
application  in  two  ways.  First,  the  user  can  toggle  as-soon-as-possible  execution  of 
elemental  units,  called  asap  mode.  Second,  the  user  can  set  the  speed  at  which  virtual  limp. 
advances  relative  to  real  clock  time. 

Both  single-keystroke  and  command-line  operation.  All  monitor  switches  may  be 
toggled  with  a  single  keystroke  while  the  application  continues  running.  Also,  the  user  can 
enter  a  command-line  mode  in  which  various  parts  of  the  system  state  may  be  queried  and 
modified. 


5.3.1.  Controlling  Virtual  Time 

The  Maniti  monitor  contains  a  user-settable  speed  variable  which  determines  the  rate  at 
which  virtual  time  advances  relative  to  the  actual  clock  time. 

The  speed  may  be  set  to  any  floating  point  value  greater  than  zero.  Thus  virtual 
speed  may  be  set  to  run,  for  example,  five  times  faster  than  clock  time  (speed  =  5)  or  at 
four  times  slower  (speed  =  0.25).  The  speed  is  logically  limited  on  the  side  by  the 
utilization  of  the  CPU.  The  execution  of  application  code  can  not  be  sped  up,  only  the  idle 
time  between  executions. 

Idle  time  can  be  eliminated  completely  by  turning  on  as-soon-as-possible 
scheduling  of  elemental  unit  ( asap-mode ).  In  asap-mode  the  virtual  time  is  advanced  to 
the  start  time  of  the  next  elemental  unit  as  soon  as  the  previous  one  completes,  resulting 
in  the  execution  of  all  EUs  in  immediate  succession.  Asap-mode  is  separate  from  the 
speed  variable  -  it  can  be  toggled  independently,  and  when  turned  off,  scheduling 
continues  at  the  previously  set  speed. 

5.3.2.  Single-Keystroke  Operation 

The  following  keys  are  active  from  the  Unix  tty  session  (not  the  console  keyboard)  while 
the  application  is  running: 
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?  shows  the  list  of  keystrokes  and  current  values  for  the  toggle  switches, 
a  toggle  as-soon-as-possible  mode, 
e  toggle  elemental  unit  tracing, 
c  toggle  calendar  tracing, 
x  toggle  calendar-switch  tracing, 
s  toggle  elemental  unit  single-stepping. 

S  toggle  calendar  single-stepping, 
q  quit  application  completely. 

<ESC>  stop  application  and  enter  command-line  mode. 

5.3.3.  Command-line  Operation 

The  following  commands  are  available  from  command  line  mode.  At  this  time,  command¬ 
line  mode  is  a  just  a  framework  with  just  a  few  commands.  More  commands  to  query  and 
set  the  system  state  are  envisioned  for  future  releases. 

help 

Get  a  list  of  command-line  mode  commands. 

quit 

Quit  the  application  completely, 
vars 

Show  all  user-settable  monitor  variables  and  their  values. 

speed  <value> 

Set  the  virtual-time  speed  to  value.  The  value  can  be  any  floating  point  value  greater  than 
zero. 

cstep  [onloff] 

Set  or  toggle  calendar  single-stepping. 

estep  [onloff] 

Set  or  toggle  eu  single-stepping. 

ctrace  [onloff] 

Set  or  toggle  calendar  tracing. 

etrace  [onloff] 

Set  or  toggle  eu  tracing. 

strace  [onloff] 

Set  or  toggle  calendar  switch  tracing. 
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Chapter  6 

Maruti  Tool  Reference 
6.1.  Maruti  Builder 


The  mbuild  program  automates  the  process  of  building  a  runnable  Maruti  application.  This 
involves  building  the  constituent  application  binaries,  integrating  and  scheduling  the 
application,  and  binding  the  application  with  the  desired  Maruti  runtime  flavor. 

Mbuild  is  normally  run  in  the  directory  in  which  the  application  config  file  and 
constituent  module  source  files  are  located.  It  will  automatically  find  the  config  file  by  its 
.cfg  extension,  read  it,  and  generate  a  makefile  that  builds  what  modules  it  finds  used  there, 
then  calls  the  other  Maruti  tools.  Mbuild  works  by  creating  an  obj  subdirectory,  and 
putting  all  output  files  there. 

If  there  is  more  than  one  config  file  in  the  current  directory,  the  desired  file  must  be 
specified  with  the  -f  <config  file>  option. 

The  user  may  optionally  customize  the  mbuild  actions  by  providing  an  Mbuild.inc 
file  in  the  current  directory.  This  file  will  be  included  into  die  makefile  generated  by 
mbuild.  In  addition  to  providing  additional  build  targets  and  dependency  lines,  the  user 
may  set  some  variables  to  modify  the  mbuild  actions  themselves: 

FLAVORS  Default:  ux+xll  ux+et4k  sa+et4k  The  list  of  runtime  flavors  with  which  to 
link  the  application. 

MPC  Default:  mpc.  The  program  executed  to  compile  MPL  programs.  Not  normally 
modified  by  users. 

MPCVJFLAGS  Default  <empty>.  Supplemental  flags  for  the  MPL  compiler.  Most 
GCC  flags  will  work  here.  Most  often  the  user  will  want  to  customize  the  include 
directories  with  -I  <dir>. 

CFG  Default:  cfg.  The  program  executed  to  interpret  the  MCL  config  file  and  integrate  the 
application.  Not  normally  modified  by  users. 

CFG\_FLAGS  Default  <empty>.  Supplemental  flags  for  the  MPC  integrator.  Not 
normally  modified  by  users. 

ALLOCATOR  Default:  allocator.  The  program  executed  to  allocate  and  schedule  the 
application.  Not  normally  modified  by  users. 

ALLOCATOR\_FLAGS  Default:  -p  1.  Flags  for  the  Allocator.  See  section  6.4.  on  the 
Allocator  below  for  more  details. 

MBIND  Default  mbind.  The  program  executed  for  binding  the  application  and  runtime 
system.  Not  normally  modified  by  the  users. 

MBIND\_FLAGS  Default: <empty>.  Rags  for  the  Mbind  program.  Not  normally 
modified  by  users. 
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62.  MPL/C  Compiler 

The  MPL/C  compiler  (mpc)  consists  of  a  modified  gcc  plus  some  attendant  scripts  to  post- 
process  the  compiler  output.  It  generates  a  .0  file  for  a  module,  plus  a  .eul  file  containing  a 
partial  elemental-unit  graph  to  be  read  by  the  integrator. 

The  mpc  program  will  accept  GCC  command-line  options.  See  the  gcc(l)  manual 
page  for  details  on  the  available  options.  The  most  commonly  used  option  will  be  -I  dir  to 
customize  the  include  directories. 

63.  MPL/C  Integrator 


The  MCL  Integrator  (cfg)  reads  the  application  config  file  (appname.cfg)  and  all  the  module 
elemental-unit  graph  files  (i modulename.eul ),  then  generates  and  checks  all  the  jobs,  tasks, 
threads,  and  connections  for  the  application.  It  outputs  a  loader  map  file  ( appname.ldf }), 
and  a  complete  application  elemntal-unit  graph  annotated  with  allocation  and  scheduling 
constraints  and  communication  parameters  ( appname.sch ).  There  are  no  cfg  options 
normally  used. 

6.4.  Allocator/Scheduler 

The  Allocator/Scheduler  (allocator)  attempts  to  find  a  valid  allocation  for  the  application 
tasks  across  the  nodes  of  the  network,  and  a  valid  schedule  for  each  node  and  for  the 
network  bus.  The  allocation  and  schedules  are  considered  valid  if  all  allocation, 
communication,  and  scheduling  constraints  are  met 

The  allocator/scheduler  stops  when  a  valid  allocation  and  schedule  is  found,  or 
when  it  is  determined  that  one  cannot  be  found.  There  is  no  attempt  to  load-balance  the 
nodes  or  minimize  network  communications  beyond  what  is  needed  for  a  minimally  valid 
schedule.  The  allocator  outputs  an  allocation  information  file  {appname.alloc)  and  calendar 
schedules  file  ( appname.cal ). 

The  allocator  takes  two  flags: 

-p  <number  of  processors>  Default  1.  The  number  of  processors  in  the  target  system. 
It  should  match  the  number  of  nodes  defined  in  the  config  file. 

-t  ctdma  slot  size>  Default  1000.  The  Time  Division  Multiplexed  Access  (TDMA)  slot 
size  for  the  network  bus.  This  is  the  time,  in  microseconds,  that  each  node  will  be  alloted  to 
transmit  on  the  network.  All  the  nodes  get  a  TDMA  slot  in  turn.  The  tdma  slot  size  should 
stay  between  1000  and  16000  microseconds,  depending  on  the  application's  latency 
requirements  and  the  network  hardware's  buffering  capacity. 

63.  Maruti  Binder 

The  Maruti  Binder  (mbind)  reads  in  the  loader  map  .Idf  file,  the  allocation  .alloc  file,  and 
the  calendars  .cal  file,  and  generates  the  static  data  structures  needed  by  the  runtime  system 
(appname-globals. c).  It  also  generates  a  makefile  ( appname-bind.mk )  that  manages  the 
linking  of  each  task  of  the  application  within  its  own  logical  address  space,  then  linking  all 
tasks  together  with  the  various  flavors  of  the  runtime  library. 
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6.6.  Timing  Trace  Analyzer 


The  Timing  Trace  Analyser  (timestat)  takes  a  list  of  timing  output  files  as  generated  by 
the  runtime  system  and  generates  a  .wcet  file  that  contains  the  worst  case  execution 
times  for  the  elemental  units,  as  needed  by  the  allocator.  Timestat  also  prints  other 
statistics  generated  by  the  runtime  system. 


6.7.  Timing  Stats  Monitor 

Timing  information  is  output  from  a  stand-alone  Maruti  system  through  a  serial  port  when 
the  application  terminates.  The  mgettimes  program,  running  on  another  computer 
connected  to  the  other  end  of  that  serial  line,  will  receive  the  timing  data  and  store  it  in  a  file 
suitable  for  processing  by  timestat.  Mgettimes  can  process  the  output  of  multiple  runs  on 
the  test  setup,  even  from  different  applications.  Simply  leave  the  program  running  and  any 
data  that  is  received  will  be  saved. 

Mgettimes  is  called  as  follows: 

mgettimes  <speed>  <serial-port> 

where  <speed>  is  the  communications  rate  at  which  the  times  will  be  output  (19200  bps  in 
the  default  core),  and  <seiial-port>  is  the  device  file  for  the  communications  port  (for 
example,  /dev/ttyOO  for  the  PC's  COM1  port). 


Maruti  3.1 
Design  Overview 
First  Edition 


Systems  Design  and  Analysis  Group 
Department  of  Computer  Science 
University  of  Maryland  at  College  Park 


1.  Introduction 

Many  complex,  mission  critical  systems  depend  not  only  on  correct  functional  behavior, 
but  also  on  correct  temporal  behavior.  These  systems  are  called.  The  most  critical  systems 
in  this  domain  are  those  which  must  support  applications  with  hard  real-time  constraints, 
in  which  missing  a  deadline  may  cause  a  fatal  error.  Due  to  their  criticality,  jobs  with  hard 
real-time  constraints  must  always  execute  satisfying  the  user  specified  timing  constraints, 
despite  the  presence  of  faults  such  as  site  crashes  or  link  failures. 

A  real-time  operating  system,  besides  having  to  support  most  functions  of  a 
conventional  operating  system,  carries  the  extra  burden  of  guaranteeing  that  the  execution 
of  its  requested  jobs  will  satisfy  their  timing  constraints.  In  order  to  carry  out  real-time 
processing,  the  requirements  of  the  jobs  have  to  be  specified  to  the  system,  so  that  a 
suitable  schedule  can  be  made  for  the  job  execution.  Thus,  conventional  application 
development  techniques  must  be  enhanced  to  incorporate  support  for  specification  of  timing 
and  resource  requirements.  Further,  tools  must  be  made  available  to  extract  these 
requirements  from  the  application  programs,  and  analyze  them  for  schedulability. 

Based  on  the  characteristics  of  its  jobs,  a  real-time  system  can  be  classified  as  static, 
dynamic  or.  In  a  static  system,  all  (hard  real-time)  jobs  and  their  execution  characteristics 
are  known  ahead  of  time,  and  thus  can  be  statically  analyzed  prior  to  system  operation. 
Many  such  systems  are  built  using  the  cyclic  executive  or  static  priority  architecture.  In 
contrast,  there  are  many  systems  in  which  new  processing  requests  may  be  made  while  the 
system  is  in  operation.  In  a  dynamic  system,  new  requests  arrive  asynchronously  and  must 
be  processed  immediately.  However,  since  new  requests  demand  immediate  attention,  such 
systems  must  either  have  “soft”  constraints,  or  be  lightly  loaded  and  rely  on  exception 
mechanisms  for  violation  of  timing  constraints.  In  contrast,  reactive  systems  have  certain 
lead  time  to  decide  whether  or  not  to  accept  a  newly  arriving  processing  request.  Due  the 
presence  of  the  lead  time,  a  reactive  system  can  carry  out  analysis  without  adversely 
affecting  the  schedulability  of  currently  accepted  requests.  If  adequate  resources  are 
available  then  the  job  is  accepted  for  execution.  On  the  other  hand,  if  adequate  resources 
are  not  available  then  the  job  is  rejected  and  does  not  execute.  The  ability  to  reject  new  jobs 
distinguishes  a  reactive  system  from  a  completely  dynamic  system. 

The  purpose  of  the  Maruti  project  is  to  create  an  environment  for  the  development 
and  deployment  of  critical  applications  with  hard  real-time  constraints  in  a  reactive 
environment  Such  applications  must  be  able  to  execute  on  a  platform  consisting  of 
distributed  and  heterogeneous  resources,  and  operate  continuously  in  the  presence  of 
faults. 

The  Maruti  project  started  in  1988.  The  first  version  of  the  system  was  designed  as 
an  object-oriented  system  with  suitable  extensions  for  objects  to  support  real-time 
operation.  The  proof-of-concept  version  of  this  design  was  implemented  to  run  on  top  of 
the  Unix  operating  system  and  supported  hard  and  non-real-time  applications  running  in  a 
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distributed,  heterogeneous  environment.  The  feasibility  of  the  fault  tolerance  concepts 
incorporated  in  the  design  of  Maruti  system  were  also  demonstrated.  No  changes  to  the 
Unix  kernel  were  made  in  that  implementation,  which  was  operational  in  1990.  We  realized 
that  Unix  is  not  a  very  hospitable  host  for  real-time  applications,  as  very  little  control  over 
the  use  of  resources  can  be  exercised  in  that  system  without  extensive  modifications  to  the 
kernel.  Therefore,  based  on  the  lessons  learned  from  the  first  design,  we  proceeded  with 
the  design  of  the  current  version  of  Maruti  and  changed  the  implementation  base  to  CMU 
Mach  which  permitted  a  more  direct  control  of  resources. 

Most  recently,  we  have  implemented  Maruti  directly  on  486  PC  hardware, 
providing  Maruti  applications  total  control  over  resources.  The  initial  version  of  the 
distributed  Maruti  has  also  been  implemented,  allowing  Maruti  applications  to  run  across  a 
network  in  a  synchronized,  hard  real-time  manner. 

In  this  paper,  we  summarize  the  design  philosophy  of  the  Maruti  system  and 
discuss  the  design  and  implementation  of  Maruti.  We  also  present  the  development  tools 
and  operating  system  support  for  mission  critical  applications.  While  the  system  is  being 
designed  to  provide  integrated  support  for  multiple  requirements  of  mission  critical 
applications,  we  focus  our  attention  on  real-time  requirements  on  a  single  processor 
system. 


2.  Maruti  Design  Goals 

The  design  of  a  real-time  system  must  take  into  consideration  the  primary  characteristics  of 
the  applications  which  are  to  be  supported.  The  design  of  Maruti  has  been  guided  by  the 
following  application  characteristics  and  requirements. 

•  Real-Time  Requirements.  The  most  important  requirement  for  real-time  systems  is 
the  capability  to  support  the  timely  execution  of  applications.  In  contrast  with  many 
existing  systems,  the  next-generation  systems  will  require  support  for  hard,  soft,and 
non-real-time  applications  on  the  same  platform. 

•  Fault  Tolerance.  Many  mission-critical  systems  are  safety-critical,  and  therefore  have 
fault  tolerance  requirements.  In  this  context,  fault  tolerance  is  the  ability  of  a  system  to 
support  continuous  operation  in  the  presence  of  faults. 

Although  a  number  of  techniques  for  supporting  fault-tolerant  systems  have  been  suggested 
in  the  literature,  they  rarely  consider  the  real-time  requirements  of  the  system.  A  real-time 
operating  system  must  provide  support  for  fault  tolerance  and  exception  handling 
capabilities  for  increased  reliability  while  continuing  to  satisfy  the  timing  requirements. 

•  Distributivity.  The  inherent  characteristics  of  many  systems  require  that  multiple 
autonomous  computers,  connected  through  a  local  area  network,  cooperate  in  a 
distributed  manner.  The  computers  and  other  resources  in  the  system  may  be 
homogeneous  or  heterogeneous.  Due  to  the  autonomous  operation  of  the  components 
which  cooperate,  system  control  and  coordination  becomes  a  much  more  difficult  task 
than  if  the  system  were  implemented  in  a  centralized  manner.  The  techniques  learned  in 
the  design  and  implementation  of  centralized  systems  do  not  always  extend  to 
distributed  systems  in  a  straightforward  manner. 
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•  Scenarios.  Many  real-time  applications  undergo  different  modes  of  operation  during 
their  life  cycle.  A  scenario  defines  the  set  of  jobs  executing  in  the  system  at  any  given 
time.  A  hard  real-time  system  must  be  capable  of  switching  from  one  scenario  to 
another,  maintaining  the  system  in  a  safe  and  stable  state  at  all  times,  without  violating 
the  timing  constraints. 

•  Integration  of  Multiple  Requirements.  The  major  challenge  in  building  operating 
systems  for  mission  critical  computing  is  the  integration  of  multiple  requirements. 
Because  of  the  conflicting  nature  of  some  of  the  requirements  and  the  solutions 
developed  to  date,  integration  of  all  the  requirements  in  a  single  system  is  a  formidable 
task.  For  example,  the  real-time  requirements  preclude  the  use  of  many  of  the  fault¬ 
handling  techniques  used  in  other  fault-tolerant  systems. 


3.  Design  Approach  and  Principles 

Maruti  is  a  time-based  system  in  which  the  resources  are  reserved  prior  to  execution. 
Resource  reservation  is  done  on  the  time-line,  thus  allowing  for  reasoning  about  real-time 
properties  in  a  natural  way.  The  time-driven  architecture  provides  predictable  execution  for 
real-time  systems,  a  necessaiy  requirement  for  critical  applications  requiring  hard  real-time 
performance.  The  basic  design  approach  is  outlined  below: 

•  Resource  Reservation  for  Hard  Real-Time  Jobs.  Hard  real-time  applications  in 
Maruti  have  advance  resource  reservation  resulting  in  a  priori  guarantees  about  the 
timely  execution  of  hard  real-time  jobs.  This  is  achieved  through  a  calendar  data 
structure  which  keeps  track  of  all  resource  reservations  and  the  assigned  time  intervals. 
The  resource  requirements  are  specified  as  early  as  possible  in  the  development  stage  of 
an  application  and  are  manipulated,  analyzed,  and  refined  through  all  phases  of 
application  development 

•  Predictability  through  Reduction  of  Resource  Contention.  Hard  real-time  jobs  are 
scheduled  using  a  time-driven  scheduling  paradigm  in  which  the  resource  contention 
between  jobs  is  eliminated  through  scheduling.  This  results  in  reduced  runtime 
overheads  and  leads  to  a  high  degree  of  predictability.  However,  not  all  jobs  can  be 
pre-scheduled.  Since  resources  may  be  shared  between  jobs  in  the  calendar  and  other 
jobs  in  the  system,  such  as  non-real-time  activities,  there  may  be  resource  contention 
leading  to  lack  of  predictability.  This  is  countered  by  eliminating  as  much  resource 
contention  as  possible  and  reducing  it  whenever  it  is  not  possible  to  eliminate  it  entirely. 
The  lack  of  predictability  is  compensated  for  by  allowing  enough  slack  in  the  schedule. 

•  Integrated  Support  for  Fault  Tolerance.  Fault  tolerance  objectives  are  achieved  by 
integrating  the  support  for  fault  tolerance  at  all  levels  in  the  system  design.  Fault 
tolerance  is  supported  by  early  fault  detection  and  handling,  resilient  application 
structures  through  redundancy,  and  the  capability  to  switch  modes  of  operation.  Fault 
detection  capabilities  are  integrated  into  the  application  during  its  development, 
permitting  the  use  of  application  specific  fault  detection  and  fault  handling.  As  fault 
handling  may  result  in  violation  of  temporal  constraints,  replication  is  used  to  make  the 
application  resilient.  Failure  of  a  replica  may  not  affect  the  timely  execution  of  other 
replicas  and,  thereby,  the  operation  of  the  system  it  may  be  controlling.  Under 
anticipated  load  and  failure  conditions,  it  may  become  necessary  for  the  system  to 
revoke  the  guarantees  given  to  the  hard  real-time  applications  and  change  its  mode  of 
operation  dynamically  so  that  an  acceptable  degraded  mode  of  operation  may  continue. 
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•  Separation  of  Mechanism  and  Policy.  In  the  design  of  Maruti,  an  emphasis  has  been 
placed  on  separating  mechanism  from  policy.  Thus,  for  instance,  the  system  provides 
basic  dispatching  mechanisms  for  a  time-driven  system,  keeping  the  design  of  specific 
scheduling  policies  separate.  The  same  approach  is  followed  in  other  aspects  of  the 
system.  By  separating  the  mechanism  from  the  policy,  the  system  can  be  tailored  and 
optimized  to  different  environments. 

•  Portability  and  Extensibility.  Unlike  many  other  real-time  systems,  the  aim  of  the 
Maruti  project  has  been  to  develop  a  system  which  can  be  tailored  to  use  in  a  wide 
variety  of  situations— from  small  embedded  systems  to  complex  mission-critical 
systems.  With  the  rapid  change  in  hardware  technology,  it  is  imperative  that  the  design 
be  such  that  it  is  portable  to  different  platforms  and  makes  minimal  assumptions  about 
the  underlying  hardware  platform.  Portability  and  extensibility  is  also  enhanced  by 
using  modular  design  with  well  defined  interfaces.  This  allows  for  integration  of  new 
techniques  into  the  design  with  relative  ease. 

•  Support  of  Hard,  Soft,  and  Non-Real-Time  in  the  Same  Environment.  Many 
critical  systems  consist  of  applications  with  a  mix  of  hard,  soft,  and  non-real-time 
requirements.  Since  they  may  be  sharing  data  and  resources,  they  must  execute  within 
the  same  environment.  The  approach  taken  in  Maruti  is  to  support  the  integrated 
execution  of  applications  with  multiple  requirements  by  reducing  and  bounding  the 
unpredictable  interaction  between  them. 

•  Support  for  Distributed  Operation.  Many  embedded  systems  require  several 
processors.  When  multiple  processors  function  autonomously,  their  use  in  hard  real¬ 
time  applications  requires  operating  system  support  for  coordinated  resource 
management  Maruti  provides  coordinated,  time-based  resource  management  of  all 
resources  in  a  distributed  environment  including  the  processors  and  the  communication 
channels. 

•  Support  for  Multiple  Execution  Environments.  Maruti  provides  support  for 
multiple  execution  environments  to  facilitate  program  development  as  well  as  execution. 
Real-time  applications  may  execute  in  the  Maruti/Mach  or  Maruti/Standalone 
environments  and  maintain  a  high  degree  of  temporal  determinacy.  The 
Maruti/Standalone  environment  is  best  suited  for  the  embedded  applications  while 
Maruti/Mach  permits  the  concurrent  execution  of  hard  real-time  and  non-real-time  Unix 
applications.  In  addition,  the  Marati/Virtual  environment  has  been  designed  to  aid  the 
development  of  real-time  applications.  In  this  environment  the  same  code  which  runs  in 
the  other  two  environments  can  execute  while  access  to  all  Unix  debugging  tools  is 
available.  In  this  environment,  temporal  accuracy  is  maintained  with  respect  to  a  virtual 
real-time. 

•  Support  for  Temporal  Debugging.  When  an  application  executes  in  the 
Marati/Virtual  environment  its  interactions  are  carried  out  with  respect  to  virtual  real¬ 
time  which  is  under  the  control  of  the  user.  The  user  may  speed  it  up  with  respect  to 
actual  time  or  slow  it  down.  The  virtual  time  may  be  paused  at  any  instant  and  the 
debugging  tools  used  to  examine  the  state  of  the  execution.  In  this  way  we  may  debug 
an  application  while  maintaining  all  temporal  relationships,  a  process  we  call  temporal 
debugging. 
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4.  Application  Development  Environment 


To  support  applications  in  a  real-time  system,  conventional  application  development 
techniques  and  tools  must  be  augmented  with  support  for  specification  and  extraction  of 
resource  requirements  and  timing  constraints.  The  application  development  system 
provides  a  set  of  programming  tools  to  support  and  facilitate  the  development  of  real-time 
applications  with  diverse  requirements.  The  Maruti  Programming  Language  (MPL)  is  used 
to  develop  individual  program  modules.  The  Maruti  Configuration  Language  (MCL)  is 
used  to  specify  how  individual  program  modules  are  to  be  connected  together  to  form  an 
application  and  the  details  of  the  hardware  platform  on  which  the  application  is  to  be 
executed. 

4.1.  Maruti  Programming  Language 

Rather  than  develop  completely  new  programming  languages,  we  have  taken  the  approach 
of  using  existing  languages  as  base  programming  languages  and  augmenting  them  with 
Maruti  primitives  needed  to  provide  real-time  support 

In  the  current  version,  the  base  programming  language  used  is  ANSI  C.  MPL  adds 
modules,  shared  memory  blocks,  critical  regions,  typed  message  passing,  periodic 
functions,  and  message-invoked  functions  to  the  C  language.  To  make  analyzing  the 
resource  usage  of  programs  feasible,  certain  C  idioms  are  not  allowed  in  MPL;  in 
particular,  recursive  function  calls  are  not  allowed  nor  are  unbounded  loops  containing 
externally  visible  events,  such  as  message  passing  and  critical  region  transitions. 

•  The  code  of  an  application  is  divided  into  modules.  A  module  is  a  collection  of 
procedures,  functions,  and  local  data  structures.  A  module  forms  an  independently 
compiled  unit  and  may  be  connected  with  other  modules  to  form  a  complete  application. 
Each  module  may  have  an  initialization  function  which  is  invoked  to  initialize  the 
module  when  it  is  loaded  into  memory.  The  initialization  function  may  be  called  with 
arguments. 

•  Communication  primitives  send  and  receive  messages  on  one-way  typed  channels. 
There  are  several  options  for  defining  channel  endpoints  that  specify  what  to  do  on 
buffer  overflow  or  when  no  message  is  in  the  channel.  The  connection  of  two  end¬ 
points  is  done  in  the  MCL  specification  for  the  application-Maruti  insures  that  end¬ 
points  are  of  the  same  type  and  are  connected  properly  at  runtime. 

•  Periodic  functions  define  entry  points  for  execution  in  the  application.  The  MCL 
specification  for  the  application  will  determine  when  these  functions  execute. 

•  Message-invoked  functions,  called  services,  are  executed  whenever  messages  are 
received  on  a  channel. 

•  Shared  memory  blocks  can  be  declared  inside  modules  and  are  connected  together  as 
specified  in  the  MCL  specifications  for  the  application. 

•  An  action  defines  a  sequence  of  code  that  denotes  an  externally  observable  action  of  the 
module.  Actions  are  used  to  specify  timing  constraints  in  the  MCL  specification. 
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•  Critical  Regions  are  used  to  safely  access  and  maintain  data  consistency  between 
executing  entities.  Maruti  ensures  that  no  two  entities  are  scheduled  to  execute  inside 
their  critical  regions  at  the  same  time. 

4.2.  Maruti  Configuration  Language 

MPL  Modules  are  brought  together  into  as  an  executable  application  by  a  specification  file 
written  in  the  Maruti  Configuration  Language  (MCL).  The  MCL  specification  determines 
the  application's  hard  real-time  constraints,  the  allocation  of  tasks,  threads,  and  shared 
memory  blocks,  and  all  message-passing  connections.  MCL  is  an  interpreted  C-like 
language  rather  than  a  declarative  language,  allowing  the  instantiation  of  complicated 
subsystems  using  loops  and  subroutines  in  die  specification.  The  key  features  of  MCL 
include: 

•  Tasks,  Threads,  and  Channel  Binding.  Each  module  may  be  instantiated  any 
number  of  times  to  generate  tasks.  The  threads  of  a  task  are  created  by  instantiating  the 
entries  and  services  of  the  corresponding  module.  An  entry  instantiation  also  indicates 
the  job  to  which  the  entry  belongs.  A  service  instantiation  belongs  to  the  job  of  its 
client  The  instantiation  of  a  service  or  entry  requires  binding  the  input  and  output  ports 
to  a  channel.  A  channel  has  a  single  input  port  indicating  the  sender  and  one  or  more 
output  ports  indicating  the  receivers.  The  configuration  language  uses  channel  variables 
for  defining  the  channels.  The  definition  of  a  channel  also  includes  the  type  of 
communication  it  supports,  i.e.,  synchronous  or  asynchronous. 

•  Resources.  All  global  resources  (i.e.,  resources  which  are  visible  outside  a  module)  are 
specified  in  the  configuration  file,  along  with  the  access  restrictions  on  the  resource. 
The  configuration  language  allows  for  binding  of  resources  in  a  module  to  the  global 
resources.  Any  resources  use  by  a  module  which  are  not  mapped  to  a  global  resource 
are  considered  local  to  the  module. 

•  Timing  Requirements  and  Constraints.  These  are  used  to  specify  the  temporal 
requirements  and  constraints  of  the  program.  An  application  consists  of  a  set  of 
cooperating  jobs.  A  job  is  a  set  of  entries  (and  the  services  called  by  the  entries)  which 
closely  cooperate.  Associated  with  each  job  are  its  invocation  characteristics,  i.e., 
whether  it  is  periodic  or  aperiodic.  For  a  periodic  job,  its  period  and,  optionally,  the 
ready  time  and  deadline  within  the  period  are  specified.  The  constraints  of  a  job  apply 
to  ail  component  threads.  In  addition  to  constraints  on  jobs  and  threads,  finer  level 
timing  constraints  may  be  specified  on  the  observable  actions.  An  observable  action 
may  be  specified  in  the  code  of  the  program.  For  any  observable  action,  a  ready  time 
and  a  deadline  may  be  specified.  These  are  relative  to  the  job  arrival.  An  action  may  not 
start  executing  before  the  ready  time  and  must  finish  before  the  deadline.  Each  thread  is 
an  implicitly  observable  action,  and  hence  may  have  a  ready  time  and  a  deadline. 

Apart  from  the  ready  time  and  deadline  constraints,  programs  in  Maruti  can  also  specify 
relative  timing  constraints,  those  which  constrain  the  interval  between  two  events.  For  each 
action,  the  start  and  end  of  the  action  mark  the  observable  events.  A  relative  constraint  is 
used  to  constrain  the  temporal  separation  between  two  such  events.  It  may  be  a  relative 
deadline  constraint  which  specifies  the  upper  bound  on  time  between  two  events,  or  a  delay 
constraint  which  specifies  the  lower  bound  on  time  between  the  occurrence  of  the  two 
events.  The  interval  constraints  are  closer  to  the  event-based  real-time  specifications,  which 
constrain  the  minimum  and/or  maximum  distance  between  two  events  and  allow  for  a  rich 
expression  of  timing  constraints  for  real-time  programs. 
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•  Replication  and  Fault  Tolerance.  At  the  application  level  fault  tolerance  is  achieved  by 
creating  resilient  applications  by  replicating  parts,  or  all,  of  the  application.  The 
configuration  language  eases  the  task  of  achieving  fault  tolerance  by  allowing 
mechanisms  to  replicate  the  modules,  and  services,  thus  achieving  the  desired  amount 
of  resiliency.  By  specifying  allocation  constraints,  a  programmer  can  ensure  that  the 
replicated  modules  are  executed  on  different  partitions. 


5.  Analysis  and  Resource  Allocation 

This  phase  involves  analyzing  the  resource  allocation  and  scheduling  of  a  collection  of 
applications  in  terms  of  their  real-time  and  fault-tolerance  properties.  The  properties  of  the 
system  are  analyzed  with  respect  to  the  system  configuration  and  the  characteristics  of  the 
runtime  system,  and  resource  calendars  are  generated. 

The  analysis  phase  converts  the  application  program  into  fine-grained  segments 
called  elemental  units.  All  subsequent  analysis  and  resource  allocation  are  based  on  EUs. 

5.1.  Elemental  Unit  Model 

The  basic  building  block  of  the  Maruti  computation  model  is  the  elemental  unit  (EU).  In 
general,  an  elemental  unit  is  an  executable  entity  which  is  triggered  by  incoming  data  and 
signals,  operates  on  the  input  data,  and  produces  some  output  data  and  signals.  The 
behavior  of  an  EU  is  atomic  with  respect  to  its  environment  Specifically: 

•  All  resources  needed  by  an  elemental  unit  are  assumed  to  be  required  for  the  entire 
length  of  its  execution. 

•  The  interaction  of  an  EU  with  other  entities  of  the  systems  occurs  either  before  it  starts 
executing  or  after  it  finishes  execution. 

The  components  of  an  EU  are  illustrated  in  Figure  1  and  are  described  below: 


input  data/signals 


Figure  1:  Structure  of  an  Elemental  Unit 
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•  Input  and  Output  Ports.  Each  EU  may  have  several  input  and/or  output  ports.  Each 
port  specifies  a  part  of  the  interface  of  the  EU.  The  input  ports  are  used  to  accept 
incoming  input  data  to  the  EU,  while  the  output  ports  are  used  for  feeding  the  output  of 
the  EU  to  other  entities  in  the  system. 

•  Input  and  Output  Monitors.  An  input  monitor  collects  the  data  from  the  input  ports, 
and  provides  it  to  the  main  body.  In  doing  so,  it  acts  as  a  filter,  and  may  also  be  used 
for  error  detection  and  debugging.  The  input  monitors  are  also  used  for  supporting 
different  triggering  conditions  for  the  EU.  Similar  to  input  monitors,  the  output 
monitors  act  as  filters  to  the  outgoing  data.  The  output  monitor  may  be  used  for  error 
detection  and  timingconstraint  enforcement.  The  monitors  may  be  connected  to  other 
EUs  in  the  system  and  may  send  (asynchronous)  messages  to  them  reporting  errors  or 
status  messages.  The  receiving  EU  may  perform  some  error-handling  functions. 


•  Main  Body.  The  main  body  accepts  the  input  data  from  the  input  monitor,  acts  on  it, 
and  supplies  the  output  to  die  output  monitor.  It  defines  the  functionality  provided  by 
the  EU. 

Annotated  with  an  elemental  unit  are  its  resource  requirements  and 
timingconstraints,  which  are  supplied  to  the  resource  schedulers.  The  resource  schedulers 
must  ensure  that  the  resources  are  made  available  to  the  EU  at  the  time  of  execution  and  that 
its  timing  constraints  are  satisfied. 

5.2.  Composition  of  EUs 

In  order  to  define  complex  executions,  the  EUs  may  be  composed  together  and  properties 
specified  on  the  composition.  Elemental  units  are  composed  by  connecting  an  output  port 
of  an  EU  with  an  input  port  of  another  EU.  A  valid  connection  requires  that  the  input  and 
output  port  types  are  compatible,  i.e.,  they  carry  the  same  message  type.  Such  a  connection 
marks  a  one-way  flow  of  data  or  control,  depending  on  the  nature  of  the  ports.  A 
composition  of  Eus  can  be  viewed  as  a  directed  acyclic  graph,  called  an  elemental  unit 
graph  (EUG),  in  which  the  nodes  are  the  EUs,  and  the  edges  are  the  connections  between 
EUs.  An  incompletely  specified  EUG  in  which  all  input  and  output  ports  are  not  connected 
is  termed  as  a  partial  EUG  (PEUG).  A  partial  EUG  may  be  viewed  as  a  higher  level  EU. 
In  a  complete  EUG,  all  input  and  output  ports  are  connected  and  there  are  no  cycles  in  the 
graph.  The  acyclic  requirement  comes  from  the  required  time  detenninacy  of  execution.  A 
program  with  unbounded  cycles  or  recursions  may  not  have  a  temporally  determinate 
execution  time.  Bounded  cycles  in  an  EUG  are  converted  into  an  acyclic  graph  by  loop 
unrolling. 

The  composition  of  EUs  supports  higher  level  abstractions  and  the  properties  associated 
with  them.  By  carefully  choosing  the  abstractions,  the  task  of  developing  applications  and 
ensuring  that  the  timing  and  other  operational  constraints  are  satisfied  can  be  greatly 
simplified.  In  Maruti,  we  have  chosen  the  following  abstractions: 

•  A  thread  is  a  sequential  composition  of  elemental  units.  It  has  a  sequential  flow  of 
control  which  is  triggered  by  a  message  to  the  first  EU  in  the  thread.  The  flow  of 
control  is  terminated  with  the  last  EU  in  the  thread.  Two  adjacent  EUs  of  a  thread  are 
connected  by  a  single  link  carrying  the  flow  of  control.  The  component  elemental  units 
may  receive  messages  or  send  messages  to  elemental  units  outside  the  thread.  All  EUs 
of  a  thread  share  the  execution  stack  and  processor  state. 
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•  A  job  is  a  collection  of  threads  which  cooperate  with  each  other  to  provide  some 
functionality.  The  partial  EUGs  of  the  component  threads  are  connected  together  in  a 
well  defined  manner  to  form  a  complete  EUG.  All  threads  within  a  job  operate  under  a 
global  timing  constraint  specified  for  the  job. 

5.3.  Program  Analysis 

Program  modules  are  independently  compiled.  In  addition  to  die  generation  of  the  object 
code,  compilation  also  results  in  the  creation  of  partial  EUGs  for  the  modules,  i.e.,  for  the 
services  and  entries  in  the  module,  as  well  as  the  extraction  of  resource  requirements  such 
as  stack  sizes  for  threads,  memory  requirements,  and  logical  resource  requirements. 

Invocation  of  an  entry  point  and  service  call  starts  a  new  thread  of  execution.  A 
control  flow  graph  is  generated  for  each  service  and  entry.  The  control  flow  graph  and  the 
MPL  primitives  are  used  to  delineate  EU  boundaries.  Note  that  an  EU  execution  is  atomic, 
i.e.,  all  resources  required  by  the  EU  are  assumed  to  be  used  for  the  entire  duration  of  its 
execution.  Further,  all  input  messages  are  assumed  to  be  logically  received  at  the  start  of  an 
EU  and  all  output  messages  are  assumed  to  be  logically  sent  at  the  end  of  an  EU.  At 
compilation  time,  the  code  for  each  entry  and  service  is  broken  up  into  one  or  more 
elemental  units.  The  delineation  of  EU  boundaries  is  done  in  a  manner  that  ensures  that  no 
cycles  are  formed  in  the  resultant  EUG.  Thus,  for  instance,  a  send  followed  by  a  receive 
within  the  same  EU  may  result  in  a  cyclic  precedence  and  must  be  prevented.  We  follow 
certain  rules  of  thumb  to  delineate  EU  boundaries,  which  may  be  overridden  and  explicitly 
changed  by  the  user.  The  EU  boundaries  are  created  at  a  receive  statement,  the  beginning 
and  end  of  a  resource  block,  and  the  beginning  and  end  of  an  observable  action.  For  each 
elemental  unit  a  symbolic  name  is  generated  and  is  used  to  identify  it  The  predecessors  and 
successors  of  the  EU  as  well  as  the  source  code  line  numbers  associated  with  the  EU  are 
identified  and  stored.  The  resource  and  timing  requirements  that  can  be  identified  during 
compilation  are  also  stored,  and  place  holders  are  created  for  the  remaining  information. 

Given  an  application  specification  in  the  Maruti  Configuration  Language  and  the 
component  application  modules,  the  integration  tools  are  responsible  for  creating  a 
complete  application  program  and  extracting  out  the  resource  and  timing  information  for 
scheduling  and  resource  allocation.  The  input  to  the  integration  process  are  the  program 
modules,  the  partial  EUGs  corresponding  to  the  modules,  the  application  configuration 
specification,  and  the  hardware  specifications.  The  outputs  of  the  integration  process  are:  a 
specification  for  the  loader  for  creating  tasks,  populating  their  address  spaces,  creating  the 
threads  ad  channels,  and  initializing  the  task;  loadable  executables  of  the  program;  and  the 
complete  application  EUG  along  with  the  resource  descriptions  for  the  resource  allocation 
and  scheduling  subsystem. 

5.4.  Communication  Model 

Maruti  supports  message  passing  and  shared  memory  models  for  communication. 

•  Message  Passing.  Maruti  supports  the  notion  of  one-way  message  passing  between 
elemental  units.  Message  passing  provides  a  location-independent  and  architecture- 
transparent  communication  paradigm.  A  channel  abstraction  is  used  to  specify  a  one 
way  message  communication  path  between  a  sender  and  a  receiver.  A  one-way 
message-passing  channel  is  set  up  by  declaring  the  output  port  on  the  sender  EU,  the 
input  port  on  the  receiver  EU,  and  the  type  of  message.  The  communication  is 
asynchronous  with  respect  to  the  sender,  i.e.,  the  sender  does  not  block. 
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•  Synchronous  Communication.  Synchronous  communication  is  used  for  lightly 
coupled  message  passing  between  elemental  units  of  the  same  job.  For  every  invocation 
of  the  sender  there  is  an  invocation  of  the  receiver  which  accepts  the  message  sent  by 
the  sender.  The  receiver  is  blocked  (de-scheduled)  until  message  arrival  under  normal 
circumstances.  The  messages  in  a  synchronous  communication  channel  are  delivered  in 
FIFO  order. 

•  Asynchronous  Communication.  Asynchronous  communication  may  be  used  for 
message  passing  between  elemental  units  not  belonging  to  the  same  job.  It  may  also  be 
used  between  real-time  and  non-real-time  jobs.  In  such  communication,  neither  the 
sender  nor  the  receiver  is  blocked  (i.e.,  there  is  no  synchronization).  Since  the  sender 
and  receiver  may  execute  at  different  rates,  it  is  possible  that  no  finite  amount  of  buffers 
suffice.  Hence,  an  asynchronous  communication  channel  is  inherently  lossy.  The 
receiver  may  specify  its  input  port  to  be  inFirst  or  inLast  to  indicate  which  messages  to 
drop  when  the  buffers  are  full.  The  first  message  is  dropped  in  an  inLast  channel,  while 
the  last  message  is  dropped  in  an  inFirst  channel. 

There  may  be  multiple  receivers  of  a  message,  thus  allowing  for  multi-cast  messages. 
Similar  to  a  one-to-one  channel,  a  multicast  channel  may  also  be  synchronous  or 
asynchronous.  All  receivers  of  a  multi-cast  message  must  be  of  the  same  type. 

•  Shared  Memory.  Shared  memory  is  also  supported  in  Maruti.  The  simplest  way  to 
share  memory  between  EUs  is  to  allow  them  to  exist  within  the  same  address  space. 
We  use  task  abstraction  for  this  purpose.  A  task  consists  of  multiple  threads  operating 
within  it,  sharing  the  address  space.  The  task  serves  as  an  execution  environment  for 
the  component  threads.  A  thread  may  belong  to  only  one  task.  In  addition  to  the  shared 
memory  within  a  task,  inter-task  sharing  is  also  supported  through  the  creation  of 
shared  memory  partitions.  A  shared  memory  partition  is  a  shared  buffer  which  can  be 
accessed  by  any  EU  permitted  to  do  so.  The  shared  memory  partitions  provide  an 
efficient  way  to  access  data  shared  between  multiple  EUs.  The  shared  memory 
communication  paradigm  provides  just  the  shared  memory  -  it  is  the  user's 
responsibility  to  ensure  safe  access  to  the  shared  data.  This  can  be  done  by  defining  a 
logical  resource  and  ensuring  that  the  resource  is  acquired  every  time  the  shared  data  is 
accessed.  By  providing  appropriate  restrictions  on  the  logical  resource,  safe  access  to 
data  can  be  ensured. 

5.5.  Resource  Model 

A  distributed  system  consists  of  a  collection  of  autonomous  processing  nodes  connected 
via  a  local  area  network.  Each  processing  node  has  resources  classified  as  processors, 
logical  resources,  and  peripheral  devices.  Logical  resources  are  used  to  provide  safe  access 
to  shared  data  structures  and  are  passive  in  nature.  The  peripheral  devices  include  sensors 
and  actuators.  Restrictions  may  be  placed  on  the  preemptability  of  resources  to  maintain 
resource  consistency.  The  type  of  the  resource  determines  the  restrictions  that  are  placed  on 
the  preemptability  of  the  resource  and  serves  to  identify  operational  constraints  for  the 
purpose  of  resource  allocation  and  scheduling.  We  classify  the  resources  into  the  following 
types  based  on  the  restrictions  that  are  imposed  on  their  usage. 
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•  Non-preemptable.  The  inherent  characteristics  of  a  resource  may  be  such  that  it 
prevents  preemptability,  i.e.,  any  usage  of  the  resource  must  not  be  preempted.  Many 
devices  require  non-preemptive  scheduling.  For  resources  which  require  the  use  of 
CPU,  this  implies  non-preemptive  execution  from  the  time  the  resource  is  acquired  until 
the  time  the  resource  is  released. 

•  Exclusive.  Unlike  a  non-preemptive  resource,  an  exclusive  resource  can  be  preempted. 
However,  the  resource  may  not  be  granted  to  anyone  else  in  the  meantime.  A  critical 
section  is  an  example  of  a  resource  which  must  be  used  in  exclusive  mode. 

•  Serially  Reusable.  A  serially  reusable  resource  can  not  only  be  preempted  but  may 
also  be  granted  to  another  EU.  The  state  of  such  resources  can  be  preserved  and 
restored  when  the  resource  is  granted  back. 

•  Shared.  A  shared  resource  may  be  used  by  multiple  entities  simultaneously.  In  a  single 
processor  system,  since  only  one  entity  is  executing  at  a  given  time,  there  is  no 
distinction  between  a  shared  resource  and  a  serially  reusable  resource. 

A  non-preemptable  resource  is  the  most  restrictive  and  a  shared  resource  is  the 
least  restrictive  in  terms  of  the  type  of  usage  allowed.  An  application  requesting  the  use  of  a 
resource  must  specify  when  the  resource  is  to  be  acquired,  when  it  is  to  be  released,  and 
the  restrictions  on  the  preemptability  of  the  resource.  The  resource  requirements  for 
applications  may  be  specified  at  different  levels  of  computational  abstractions  as  identified 
below. 

•  EU  level.  The  lowest  level  a  resource  requirement  can  be  specified  at  is  the  EU  level.  A 
resource  requirement  specified  at  the  EU  level  implies  that  the  resource  is  acquired  and 
released  within  the  EU.  For  scheduling  purposes,  it  is  assumed  that  the  resource  is 
required  for  the  entire  duration  of  the  execution  of  the  EU. 


•  Thread  Level.  Resource  specification  at  the  thread  level  is  used  for  resources  which 
are  acquired  and  released  by  different  EUs  belonging  to  the  same  thread.  For  instance, 
a  critical  section  may  be  acquired  in  one  EU  and  released  in  another  one. 

•  Job  Level.  Job-level  resource  specifications  are  used  to  specify  resources  which  are  not 
acquired  and  released  for  each  invocation  of  a  periodic  or  sporadic  job.  Instead,  these 
resources  are  acquired  at  the  job  initialization  and  released  at  job  termination.  For  a 
periodic  job,  an  implicit  resource  associated  with  each  thread  are  the  thread  data 
structures  (including  procesor  stack  and  registers). 

5.6.  Operational  Constraints 

The  execution  of  EUs  is  constrained  through  various  kinds  of  operational  constraints.  Such 
constraints  may  arise  out  of  restricted  resource  usage  or  through  the  operational 
requirements  of  the  application.  Examples  of  such  constraints  are:  precedence,  mutual 
exclusion,  ready  time,  and  deadline.  They  may  be  classified  into  the  following  categories: 

•  Synchronization  Constraints.  Synchronization  constraints  arise  out  of  data  and 
control  dependencies  or  through  resource  preemption  restrictions.  Typical  examples  of 
such  constraints  are  precedence  and  mutual  exclusion. 
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•  Timing  Constraints.  Many  types  of  timing  constraints  may  be  specified  at  different 
levels,  i.e.,  at  job  level,  thread  level,  or  EU  level.  At  the  job  level,  one  may  specify  the 
ready  time,  deadline,  and  whether  the  job  is  periodic,  sporadic,  or  aperiodic.  For 
threads,  a  ready  time  and  deadline  may  be  specified  relative  to  the  job  arrival.  Likewise, 
a  ready  time  and  deadline  may  be  specified  for  an  individual  EU.  We  also  support  the 
notion  of  relative  timing  constraints,  i.e.,  constraints  on  the  temporal  distance  between 
the  execution  of  two  EUs. 

•  Allocation  Constraints.  In  our  model,  tasks  are  allocated  to  processing  nodes. 
Allocation  constraints  are  used  to  restrict  the  task  allocation  decisions.  Allocation 
constraints  often  arise  due  to  fault-tolerance  requirements,  where  the  replicas  of  EUs 
must  be  allocated  on  different  processing  nodes.  Similarly,  when  two  tasks  share 
memory,  they  must  be  allocated  on  the  same  processing  node.  Sometimes  a  task  must 
be  bound  to  a  processing  node  since  it  uses  a  particular  resource  bound  to  the  node 
(e.g.,  a  sensor). 

The  operational  constraints  are  made  available  to  the  resource  allocation  and 
scheduling  tools  which  must  ensure  that  the  allocation  and  scheduling  maintains  the 
restrictions  imposed  by  the  constraints.  The  model  does  not  place  any  a  priori  restrictions 
on  the  nature  of  the  constraints  that  may  be  specified.  However,  the  techniques  used  by  the 
resource  allocator  and  scheduler  wall  depend  on  the  type  of  constraints  that  can  be 
specified. 

5.7.  Allocation  and  Scheduling 

After  the  application  program  has  been  analyzed  and  its  resource  requirements  and 
execution  constraints  identified,  it  can  be  allocated  and  scheduled  for  a  runtime  system. 

This  final  phase  of  program  development  depends  upon  the  physical  characteristics 
of  the  hardware  on  which  the  application  will  be  run,  for  example,  the  location  of  devices 
and  the  number  of  nodes  and  type  of  processors  on  each  node  in  the  distributed  system. 

Maruti  uses  time-based  scheduling  and  the  scheduler  creates  a  data  structure  called  a 
calendar  which  defines  the  execution  instances  in  time  for  all  executable  components  of 
the  applications  to  be  run  concurrently. 

We  consider  the  static  allocation  and  scheduling  in  which  a  task  is  the  finest 
granularity  object  of  allocation  and  an  EU  instance  is  the  unit  of  scheduling.  In  order  to 
make  the  execution  of  instances  satisfy  the  specifications  and  meet  the  timing  constraints, 
we  consider  a  scheduling  frame  whose  length  is  the  least  common  multiple  of  all  tasks' 
periods.  As  long  as  one  instance  of  each  EU  is  scheduled  in  each  period  within  the 
scheduling  frame  and  these  executions  meet  the  timing  constraints  a  feasible  schedule  is 
obtained. 

As  a  part  of  the  Maruti  development  effort,  a  number  of  scheduling  techniques  have 
been  developed  and  are  used  for  generating  schedules  and  calendars  for  task  sets.  These 
techniques  include  the  use  of  temporal  analysis  and  simulated  annealing.  Schedules  for 
single-processor  systems  as  well  as  multiple-processor  networks  are  developed  using  these 
techniques. 
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6.  Maruti  Runtime  System 

The  runtime  system  provides  the  conventional  functionality  of  an  operating  system  in  a 
maimer  that  supports  the  timely  dispatching  of  jobs.  There  are  two  major  components  of 
the  runtime  system  -  the  Maruti  core,  which  is  the  operating  system  code  that  implements 
scheduling,  message  passing,  process  control,  thread  control,  and  low  level  hardware 
control,  and  the  runtime  dispatcher,  which  performs  resource  allocation  and  scheduling  for 
dynamic  arrivals. 

6.1.  The  Dispatcher 

The  dispatcher  carries  out  the  following  tasks: 

•  Resource  Management  The  dispatcher  handles  requests  to  load  applications.  This 
involves  creating  all  the  tasks  and  threads  of  the  application,  reserving  memory,  and 
loading  the  code  and  data  into  memory.  All  the  resources  are  reserved  before  an 
application  is  considered  successfully  loaded  and  ready  to  run. 

•  Calendar  Management.  The  dispatcher  creates  and  loads  the  calendars  used  by 
applications  and  activates  them  when  the  application  run  time  arrives.  The  application 
itself  can  activate  and  deactivate  calendars  for  scenario  changes. 

•  Connection  Management.  A  Maruti  application  may  consist  of  many  different  tasks 
using  channels  for  communication.  The  dispatcher  sets  up  the  connections  between  the 
application  tasks  using  direct  shared  buffers  for  local  connections  or  a  shared  buffer 
with  a  communications  agent  for  remote  connections. 

•  Exception  Handling.  Rogue  application  threads  may  generate  exceptions  such  as 
missed  deadlines,  arithmetic  exceptions,  stack  overflows,  and  stray  accesses  to 
unreserved  memory.  These  exceptions  are  normally  handled  by  the  dispatcher  for  all 
the  Maruti  application  threads.  Various  exception  handling  behaviors  can  be 
configured,  from  terminating  the  entire  application  or  just  the  errant  thread,  to  simply 
invoking  a  task-specific  handler. 

6.2.  Core  Organization 

The  core  of  the  Maruti  hard  real-time  runtime  system  consists  of  three  data  structures: 

•  The  calendars  are  created  and  loaded  by  the  dispatcher.  Kernel  memory  is  reserved  for 
each  calendar  at  the  time  it  is  created.  Several  system  calls  serve  to  create,  delete, 
modify,  activate,  and  deactivate  calendars. 

•  The  results  table  holds  timing  and  status  results  for  the  execution  of  each  elemental 
unit.  The  maruti_calendar_results  system  call  reports  these  results  back  up  to  the  user 
level,  usually  to  the  dispatcher.  The  dispatcher  can  then  keep  statistics  or  write  a  trace 
file. 

•  The  pending  activation  table  holds  all  outstanding  calendar  activation  and  deactivation 
requests.  Since  the  requests  can  come  before  the  switch  time,  the  kernel  must  track  the 
requests  and  execute  diem  at  the  correct  time  in  the  correct  order. 
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The  scheduler  gains  control  of  the  CPU  at  every  clock  tick  interrupt.  At  that  time,  if 
a  Maruti  thread  is  currently  running  and  its  deadline  has  passed  its  execution  is  stopped  and 
an  exception  raised. 

If  any  pending  activations  are  due  to  be  executed  those  requests  are  handled, 
thereby  changing  the  set  of  active  calendars.  Then  the  next  calendar  entry  is  hecked  to  see  if 
it  is  scheduled  to  execute  at  this  time.  If  so,  the  scheduler  switches  immediately  to  the 
specified  thread.  If  no  hard  real-time  threads  are  scheduled  to  execute,  the  calendar 
scheduler  falls  through  to  the  soft  and  non-real-time,  priority-based  schedulers. 

Maruti  threads  indicate  to  the  scheduler  that  they  have  successfully  reached  the  end 
of  their  elemental  unit  with  the  maruti_unit_done  system  call.  This  call  marks  the  current 
calendar  entry  as  done  and  fills  in  the  time  actually  used  by  the  thread.  The  Maruti  thread  is 
then  suspended  until  it  next  appears  in  the  calendars.  Soft  and  non-real-time  threads  can  be 
run  until  the  next  calendar  entry  is  scheduled  and  are  executed  using  a  priority  based 
scheduling  for  the  available  time  slots. 

At  all  times  the  Maruti  scheduler  knows  which  calendar  entiy  will  be  the  next  one  to 
run  so  that  the  calendars  are  not  continually  searched  for  work.  This  is  recalculated  when 
maruti_unit_done  is  called  or  whenever  the  set  of  active  calendars  changes. 

6.3.  Multiple  Scenarios 

The  Maruti  design  includes  the  concept  of  scenarios,  implemented  at  runtime  as  sets  of 
alternative  calendars  that  can  be  switched  quickly  to  handle  an  emergency  or  a  change  in 
operating  mode.  These  calendars  are  pre-scheduled  and  able  to  begin  execution  without 
having  to  invoke  any  user-level  machinery.  The  dispatcher  loads  the  initial  scenarios 
specified  by  the  application  and  activates  one  of  them  to  begin  normal  execution.  However, 
the  application  itself  can  activate  and  deactivate  scenarios.  For  example,  an  application 
might  need  to  respond  instantaneously  to  the  pressing  of  an  emergency  shutdown  button.  A 
single  system  call  then  causes  the  immediate  suspension  of  normal  activity  and  the  running 
of  the  shutdown  code  sequence.  Calendar  activation  and  deactivation  commands  can  be 
issued  before  the  desired  switch  time.  The  requests  are  recorded  and  the  switches  occur  at 
the  precise  moment  specified.  This  allows  the  application  to  insure  smooth  transitions  at 
safe  points  in  the  execution. 


Figure  2:  Maruti  System  Architecture 


394 


7.  Maruti  3.1  System  Architecture 

Maruti  3.1,  the  current  version  of  the  operating  system,  implements  most  of  the  above 
design  with  a  series  of  development  tools  that  operate  in  a  Berkeley  Unix  development 
environment  (NetBSD  1.0)  on  IBM-compatible  486  or  Pentium  PCs.  Maruti  applications 
can  be  run  stand-alone  on  the  bare  hardware  or  under  a  Unix-based  debugging 
environment. 

•  MPL  code  is  processed  by  the  MPL  compiler,  a  modified  version  of  the  GNU  C 
compiler.  The  MPL  compiler  generates  both  the  compiled  object  code  and  partial  EUG 
file  that  contains  all  information  extracted  from  the  module  for  further  analysis, 
including  the  boundaries  of  the  elemental  units  of  the  program. 

•  The  application's  MCL  specification  is  read  and  interpreted  by  the  integrator.  The 
PEUG  file  describing  each  module  used  in  the  application  is  processed  and  intermodule 
type  checking  performed.  The  integrator  generates  a  file  specifying  the  full  application 
EUG,  allocation,  and  scheduling  constraints. 


•  The  allocator/scheduler  reads  in  the  data  supplied  by  the  integrator  and  a  description  of 
the  physical  system  on  which  to  allocate  the  application.  The  allocator  searches  for  an 
arrangement  of  elemental  units  on  the  nodes  of  the  network  that  satisfies  all  the  timing 
and  allocation  constraints,  considering  the  computation  times  for  each  elemental  unit.  If 
a  feasible  schedule  can  be  found,  a  calendar  file  for  each  resource  is  generated.  A 
loader  map  is  also  generated  which  describes,  for  the  runtime  system,  each  task, 
thread,  shared  memory  area,  and  communications  link  so  that  all  the  resources  can  be 
reserved  when  the  application  is  loaded. 

•  The  computation  time  analyzer  takes  timing  trace  information  generated  by  the  runtime 
system  and  generates  worst-case  execution  times  for  all  the  Eus  of  the  application.  This 
timing  information  can  be  used  in  subsequent  runs  of  the  scheduler  to  refine  die 
schedule  and  verify  its  feasibility  given  changes  in  computation  times.  Use  of  the 
timing  tool  during  testing  leads  to  very  high  confidence  in  the  schedule. 

7.1.  Runtime  Environments 

Compiled  and  analyzed  Maruti  applications  can  be  executed  in  multiple  runtime 

environments. 


•  The  Maruti/Virtual  runtime  environment  allows  the  debugging  of  Maruti  applications 
within  the  development  environment.  Applications  run  in  virtual  real-time  under  Unix, 
allowing  temporal  debugging,  including  single  stepping  the  real-time  calendars. 

•  The  Maruti/Mach  runtime  environment  is  a  modified  version  of  Mach  which  allows  the 
running  of  real-time  Maruti  programs  within  the  Mach  environment,  where  the  real-time 
and  non-real-time  task  can  co-exist  and  interact  in  the  same  host 

•  The  Maruti/Standalone  runtime  environment  runs  the  application  on  the  bare  hardware, 
suitable  for  embedded  systems.  The  application  is  linked  with  a  minimal  Maruti  core 
library  and  can  be  booted  directly. 
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7.2.  Maruti/Virtual  Runtime  Environment 


Testing  real-time  programs  in  their  native  embedded  environments  can  be  tedious  and  very 
time-consuming  because  of  the  lack  of  debugging  facilities  and  the  requirement  to  reload 
and  reboot  the  target  computer  every  time  a  change  is  made.  Maruti  provides  a  Unix-based 
runtime  system  that  allows  the  execution  of  Maruti  hard-real-time  applications  from  within 
the  Unix  development  environment.  This  Unix  execution  environment  supports  the 
following  features: 

•  The  Maruti  application  has  direct  control  of  its  I/O  device  hardware. 

•  Graphical  output  and  keyboard  input  can  go  either  to  the  PC  console,  as  in  the 
Maruti/Standalone  and  Maruti/Mach  environments,  or  appear  in  an  X  window  on  any 
Unix  workstation,  possibly  across  the  network. 

•  The  application  can  be  run  under  the  Unix  GNU  Debugger,  allowing  the  examination 
of  program  variables  and  stack  traces,  setting  of  breakpoints,  and  post-mortem 
analysis. 
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Figure  3:  Maruti/Virtual  screen  running  in  the  development  environment 


•  The  application  has  access  to  Unix  standard  output  so  it  can  print  debug  and  status 
messages  to  the  interactive  session  while  running. 

•  The  Maruti  application  runs  in  virtual  real-time ;  that  is,  it  sees  itself  running  in  hard- 
real-time  against  a  virtual  time  base. 
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•  The  virtual  time  can  be  manipulated  through  the  runtime  system  for  temporal 
debugging.  Virtual  time  can  be  slowed  down  or  sped  up,  and  individual  elemental  units 
(EU)  or  whole  calendars  can  be  single-stepped  or  traced. 

7.3.  Maruti/Standalone  Real-Time  Environment 

•  Maruti/Standalone  provides  a  minimal  runtime  system  for  the  execution  of  a  Maruti 
application  on  the  bare  hardware.  The  stand-alone  environment  has  the  following 
attributes: 

•  The  stand-alone  version  of  an  application  is  built  from  the  same  object  modules  as  are 
used  in  the  Unix  and  Maruti/Mach  execution  environments. 

•  All  the  modules  of  the  application  are  bound  with  only  those  routines  of  the  Maruti  core 
that  are  needed  into  one  executable,  suitable  for  booting  directly  or  converting  into 
ROM. 

•  The  application  has  complete  control  of  the  computer  hardware. 

•  The  application  runs  in  hard  real-time  with  very  low  overhead  and  variability. 

•  The  minimal  Maruti/Standalone  core  library  currently  consists  of  about  16  KB  of  code 
and  16  KB  of  data. 

•  The  optional  Maruti  Distributed  Operation  support  (including  network  driver)  is  about 
14  KB  of  code  and  9  KB  of  data. 

•  The  optional  Maruti  graphics  package  currently  consists  of,  for  the  standard  VGA 
version,  10  KB  of  code  and  20  KB  of  data  (plus  150K  for  a  secondary  frame  buffer  for 
best  performance). 

7.4.  Maruti/Mach  Real-Time  Environment 

The  original  execution  environment  for  Maruti-2  was  a  modified  version  of  the  CMU  Mach 
3.0  kernel.  Maruti/Mach  is  potentially  useful  in  hybrid  environments  in  which  the  real-time 
components  co-exist  with  Mach  and  Unix  processes  on  the  same  CPU.  Because  of 
preemptability  problems  in  CMU  Mach  we  will  not  be  distributing  Maruti/Mach  until  it  can 
be  rehosted  onto  OSF1/MK  real-time  kernels. 

The  Maruti/Mach  features  include  the  following: 

•  A  calendar-based  real-time  scheduler  has  been  added  to  the  CMU  Mach  3.0  kernel. 
This  scheduler  takes  precedence  over  the  existing  Mach  scheduler,  running  Maruti 
elemental  units  from  the  calendar  at  the  proper  release  time. 


•  The  Maruti  application  and  most  of  the  runtime  system  run  as  normal  Mach  user-level 
tasks  and  threads,  which  are  wired  down  in  memory. 
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•  The  Maruti  application  may  communicate  with  non-Maruti  Unix  and  Mach  processes 
through  shared  memory. 

•  The  Maruti/Mach  kernel  maintains  runtime  information  for  each  elemental  unit 
executed,  and  makes  that  information  available  to  the  user-level  code  for  worst-case 
computation  time  analysis. 

•  Parts  of  the  CMU  Mach  kernel  remain  unpreemptable.  Nevertheless,  on  a  dedicated 
system  we  can  achieve  release  time  variability  of  about  100  microseconds.  The  context 
switch  time  is  about  200  microseconds. 

•  The  new  release  of  OSF  Research  Institute  Mach  MK6.0  addresses  most  of  the  Mach 
kernel  preemptability  concerns.  We  will  be  porting  Maruti/Mach  to  this  base  in  the  near 
future. 


8.  Future  Directions 

The  Maruti  Project  is  an  ongoing  research  effort  We  hope  to  extend  the  current  system  in 
a  number  of  possible  directions.  Of  course,  since  this  is  a  research  project,  we  expect  our 
ideas  to  evolve  over  time  as  we  gain  experience  and  get  feedback  from  users. 

8.1.  Scheduling  and  Analysis  Extensions 

Preemptable  Scheduling  of  Hard-Real-Time  Tasks 

We  are  planning  to  extend  our  scheduling  approach  to  incorporate  controlled  preemptions 
of  tasks.  To  date  we  have  concentrated  on  using  non-preemptable  executions  of  tasks, 
which  simplifies  scheduling  and  eases  exclusion  problems  in  application  development 
However,  the  non-preemptability  assumption  to  exclusion  is  not  scalable  to  a 
multiprocessor,  as  threads  running  on  different  processors  can  interfere  with  each  other. 
Controlled  preemption  is  more  powerful,  as  it  allows  scheduling  of  long-running  tasks 
concurrently  with  high  frequency  tasks.  Preemption  will  remain  under  the  control  of  the 
application. 

Language  support  for  atomic  actions  will  be  developed  to  replace  the  assumption  of 
non-preemptable  EU's.  Action  statements  will  serve  to  delineate  sections  of  code  on  which 
precise  timing  requirements  can  be  imposed  by  the  application  designer.  Combined  with 
critical  region  statements  (already  implemented),  actions  will  allow  the  programmer  to 
specify  precisely  the  desired  timing  and  resource  interrelationships  in  a  manner  that  is 
scalable  to  a  multiprocessor  or  network  cluster,  unlike  the  non-preemptability  assumption. 

We  will  extend  the  Maruti  run-time  system  to  handle  preemptable  hard  real-time 
tasks.  This  will  be  done  in  coordination  with  the  analysis  tools  which  will  generate 
multiple  calendar  entries  for  the  preempted  EUs.  All  but  the  last  entry  for  the  EU  will  be 
marked  as  preemptable,  and  all  but  the  first  will  be  marked  as  continuation  entries.  This  is 
enough  information  for  the  run-time  scheduler  to  correctly  handle  the  preemption  in  a 
controlled  manner,  even  when  the  EU  completes  early. 
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Integration  of  Time-based  and  Priority-Based  Scheduling 

We  plan  to  integrate  the  time-based  and  priority-based  scheduling  in  a  single  framework. 
To  date  we  have  concentrated  on  time-based  scheduling  only.  To  support  other  scheduling 
paradigms  within  the  time-based  framework,  we  may  reserve  time  slots  in  the  schedule  and 
associate  a  queue  of  waiting  tasks  which  are  executed  on  the  basis  of  their  priorities.  In  this 
way  we  can  implement  rate-monotonic  style  static  priority  schemes  as  well  as  Earliest- 
Deadline-First  style  dynamic  priority  schemes  within  the  Maruti  framework.  However,  in 
order  to  assure  that  the  tasks  executed  under  priority-based  scheduling  will  continue  to  meet 
their  temporal  requirements,  extensions  to  the  analysis  techniques  are  required.  We  will 
develop  analysis  techniques  suitable  for  this  purpose. 

We  will  extend  the  Maruti  implementation  to  support  non-calendar  schedulers,  such 
as  priority  based  or  earliest-deadline-first  based  schedulers.  These  schedulers  will  run  in 
particular  slots  specified  in  the  Maruti  calendar,  or  when  the  calendar  is  idle. 

POSIX-RT  Subset  API 

In  a  related  area,  we  plan  to  study  the  use  of  a  subset  of  the  POSIX  API  as  the  Maruti  API 
for  soft  and  non-real-time  tasks.  We  will  implement  as  much  of  the  POSIX-RT  API  as  is 
appropriate  and  practicable. 

Asynchronous  Events 

Generally,  in  a  time-based  system,  events  are  polled  for  at  the  maximum  frequency  at 
which  they  are  expected.  This  type  of  event  handling  is  easy  to  analyze  within  the  time- 
based  framework,  and  makes  explicit  the  need  to  reserve  enough  time  to  handle  the  event 
stream  at  its  worst-case  arrival  rate.  At  this  worst-case  rate,  polling  is  more  efficient  than 
interrupt-driven  event  handling  because  the  interrupt  overhead  is  avoided.  However,  at  low 
event  rates,  polling  is  less  efficient  and  fragments  the  cpu  idle  time  (where  we  define  idle 
time  from  the  point  of  view  of  hard  real-time  tasks).  While  conservation  of  idle  time  is  not 
an  issue  for  small  controllers,  it  becomes  very  important  when  there  are  soft-  and  non-real- 
time  tasks  running  in  the  system. 

Currently,  Maruti  takes  the  polling  approach  to  ease  analysis  and  to  better  handle 
the  worst  case  rate.  We  plan  to  study  the  analysis  required  to  accommodate  asynchronous 
events  within  a  calendar  schedule.  Our  intended  approach  is  to  work  with  a  specified 
maximum  frequency,  relative  deadline,  and  computation  time  of  the  asynchronous  event, 
and  to  reserve  enough  time  in  the  calendar  for  the  event  to  occur  at  its  maximum  frequency. 

We  will  extend  the  Maruti  run-time  system  to  register  and  dispatch  event  handlers  in 
response  to  external  events.  Included  in  this  extension  will  be  the  ability  to  detect  and 
appropriately  handle  overload  conditions  (i.e.  when  the  events  occur  more  quickly  that 
expected). 

Multi-Dimensional  Resource  Scheduling  Research 

A  typical  real-time  application  requires  several  resources  for  it  to  execute.  While  CPU  is  the 
most  critical  resource,  others  have  to  be  made  available  in  a  timely  manner.  Generation  of 
schedules  for  multiple  resources  is  known  to  be  a  difficult  problem.  Our  approach  to  date 
has  been  to  develop  efficient  search  techniques,  such  as  one  based  on  simulated  annealing. 

Realistic  problems  contain  a  variety  of  interdependencies  among  tasks  which  must 
be  reflected  as  constraints  in  scheduling.  We  plan  to  develop  efficient  techniques  for 
scheduling  the  allocation  and  deallocation  of  portions  of  multidimensional  resources.  In 
particular,  we  will  address  the  problems  of  allocation  and  management  of  resources  such  as 
memory  and  disk  space,  that  can  accommodate  many  entities  simultaneously. 
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Scheduling  System-Specific  Topologies 

In  a  related  area,  many  communications  networks  have  more  complex  structures  than  a 
simple  bus  and  cannot  be  treated  as  a  single  dedicated  resource.  We  will  study  the 
extension  of  our  scheduling  algorithms  to  support  point-to-point  meshes  of  nodes  (with 
store-and-forward  of  messages),  switched  networks  (such  as  MyriNet),  and  sophisticated 
backplanes  such  as  that  used  in  the  Intel  Paragon. 

We  will  investigate  the  use  of  a  general  framework  for  specifying  the  properties  of 
connection  topologies  to  the  Maruti  scheduler.  In  the  worst  cases,  the  scheduler  for  a 
complex  interconnection  technology  may  have  to  be  programmed  explicitly.  To  handle 
such  cases,  we  will  develop  a  modular  interface  into  our  allocator/scheduler  into  which 
such  backplane-specific  schedulers  can  be  plugged. 

Static  Estimation  of  Execution  Times 

Currently,  execution  times  are  derived  through  extensive  testing  of  the  program  on  the 
target  hardware  environment.  Deriving  the  execution  time  through  static  analysis  is 
hampered  by  the  data  dependencies  present  in  large  number  in  most  programs. 

We  will  investigate  the  use  of  static  analysis  to  help  prove  the  execution  time  limits 
of  programs.  While  generating  a  reasonable  computation  time  estimate  through  static 
analysis  is  not  feasible  in  general,  it  is  possible  to  get  accurate  results  for  large  segments  of 
a  program,  and  to  clearly  identify  the  existing  data  dependencies  so  that  the  programmer 
can-through  program  modifications  or  directives  to  the  analysis  tool-eliminate,  curtail,  or 
characterize  the  data  dependencies  well  enough  to  get  very  useful  verification  of  the  time 
properties  of  the  program. 

Temporal  Debugging 

When  we  develop  real-time  applications  we  need  techniques  for  observing  the  temporal 
behavior  of  programs.  For  their  functional  characteristics  we  can  use  standard  debuggers 
which  permit  the  observation  of  the  state  of  execution  at  any  stage.  This,  however, 
destroys  the  temporal  relationships  completely.  In  Maruti/Virtual  we  provide  the  facilities  of 
controlling  the  execution  of  all  parts  of  an  application  with  respect  to  a  virtual  time  which 
advances  under  the  control  of  keyboard  directives.  Thus. we  can  pause  the  execution  at  any 
virtual  time  instant  with  the  assurance  that  all  temporal  relationships  with  respect  to  this 
instant  are  accurately  reflected  in  the  state  of  the  program.  We  use  the  term  temporal 
debugging  for  this. 

We  will  conduct  research  on  the  theoretical  aspects  of  the  issues  of  temporal 
debugging  and  consider  the  implications  of  temporal  debugging.  In  particular,  we  will 
study  how  the  interactions  of  programs  executing  in  virtual  time  with  external  events  which 
occur  with  respect  to  their  own  time  line  should  be  captured  in  temporal  debugging.  We 
will  also  study  how  the  virtual  times  of  several  nodes  in  a  distributed  environment  should 
be  coordinated. 

We  will  extend  our  implementation  of  temporal  debugging  tools  in  the 
Maruti/Virtual  environment  to  support  temporal  debugging  of  distributed  programs,  and  to 
support  fine  grained  modification  of  the  time  line. 
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Dynamic  Schedule  Generation 


We  will  develop  the  notion  of  time  horizons  to  support  controlled  modifications  of  the  hard 
real-time  calendars  at  runtime  to  support  programs  that  generate  schedules  dynamically. 
While  the  run-time  mechanisms  for  modifying  the  calendars  are  already  implemented 
research  issues  relating  to  finding  safe  points  to  switch  schedules,  and  scheduling  the 
schedulers  themselves,  have  to  be  studied  before  effective  use  can  be  made  of  on-line 
calendar  generation. 

8.2.  Fault  Tolerance 

Maruti  currently  supports  several  powerful  mechanisms  for  building  fault  tolerant 
applications: 

•  Maruti  Configuration  Language  (MCL)  constructs  allow  the  application  designer  to 
specify  replication  of  application  subsystems  with  forkers  and  joiners  inserted  into  the 
communication  streams,  as  well  as  the  allocation  constraints  necessary  to  correctly 
partition  the  replicated  subsystems  for  the  desired  level  of  fault  tolerance. 


•  Maruti  Programming  Language  (MPL)  allows  the  programming  of  application  specific 
fault  tolerance  components  such  as  forkers  and  joiners,  elemental  unit  monitors  and 
channel  monitors. 


•  The  run-time  system  supports  multiple  calendars,  allowing  the  application  to  switch  to 
emergency  or  fault  handling  scenarios  in  real  time. 

We  plan  to  extend  the  existing  mechanisms  by  providing  tools  and  new 
mechanisms  to  better  automate  the  process  of  building  fault  tolerant  applications  The  new 
features  will  include: 


•  A  library  of  forkers  and  joiners  that  can  be  incorporated  into  applications. 

•  Support  for  multicast  messages. 


•  Better  support  in  Maruti  Programming  Language  (MPL)  for  EU  and  channel  monitors. 


*  rePlication  of  subsystems,  and  analysis  of  fault  tolerance  properties  through 

MAGIC,  the  graphical  integrator  described  below. 

8.3.  Clock  Synchronization 

Currently,  distributed  Maruti  handles  clock  drift  at  boot-up  time,  and  thereafter  time  slave 
nodes  simply  adopt  the  time-master's  clock  periodically.  This  scheme  is  suitable  for  many 
applications,  but  is  not  ideal  for  embedded  control  systems  that  will  suffer  from  a 
discontinuous  time  jump. 

To  address  this  problem  we  plan  to  develop  and  implement  time-synchronization 
algorithms  that  operate  concurrently  with  the  distributed  real-time  program  to  continually 
adjust  the  clocks  on  all  the  nodes,  taking  into  account  changes  in  their  relative  drift.  This 
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will  most  likely  involve  a  regular  time  pulse  from  a  master  clock,  from  which  the  other 
nodes  continually  measure  their  drift  and  fine-tune  their  tick  rates.  Since  the  clock  drifts  are 
about  one  order  of  magnitude  less  than  the  communication  latency  variances,  a  simple 
algorithm  will  not  suffice  here. 

8.4.  Heterogeneous  Operation 

We  will  extend  our  communications  agents  and  boot  protocol  to  translate  typed  Maruti 
messages  between  heterogeneous  hosts  when  needed.  The  off-line  Maruti  analysis  tools 
already  collect  information  on  the  types  of  the  channel  endpoints  for  type-checking  the 
connection.  We  will  carry  this  information  through  to  the  run-time  system  for  use  in  those 
channels  that  are  connected  between  heterogeneous  nodes. 

8.5.  MPL/Ada 

We  will  incorporate  Maruti  Programming  Language  (MPL)  features  and  analysis  into  the 
Ada  95  programming  language  as  we  did  for  ANSI  C  in  the  current  MPL,  which  we  will 
now  refer  to  as  MPL/C.  Implementing  MPL/Ada  will  involve  the  following  tasks: 

•  A  detailed  design  review  studying  those  features  of  Ada  which  are  compatible  with 
Maruti  and  those  that  are  not,  and  how  best  to  proceed  with  the  implementation  of 
MPL/Ada. 

•  Port  GNU  Ada  (GNAT)  to  our  NetBSD  development  environment. 

•  Implement  as  much  of  the  Ada  run-time  environment  as  is  practicable  on  the  Maruti 
run-time. 


•  Install  hooks  into  GNAT  to  extract  the  resource  usage  information  we  need.  We  expect 
this  work  will  leverage  heavily  from  the  MPL/C  work,  as  GNAT  is  derived  from  the 
same  back-end  code  base  as  GNU  C. 


•  Develop  and  enforce  within  GNAT  those  restrictions  on  Ada  constructs  needed  in  order 
to  preserve  the  properties  needed  for  our  hard  real-time  analysis. 


•  Add  support  for  Maruti  primitives  to  the  language.  Some  Maruti  primitives  might  be 
implementable  directly  through  existing  Ada  facilities  and  thus  will  not  require  language 
extensions. 
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Figure  4:  Prototype  Graphical  Program  Integrator  Tool. 


8.6.  Graphical  Tools 

Graphical  Program  Development  Tools 

Currently,  Maruti  applications  are  pulled  together  by  an  MCL  specification,  which  takes  the 
form  of  a  procedural  language  whose  primitive  operations  instantiate  and  bind  together  the 
parts  of  the  application.  This  type  of  specification  language  is  complete,  allowing  the 
specification  of  large,  complex  applications  connected  in  arbitrary  ways.  However,  such 
completeness  makes  MCL  relatively  low-level  and  tedious  to  program. 

We  are  developing  graphical  program  development  tools  which  allow  the 
application  designer  to  pull  together  the  modules  using  an  entirely  graphical  user  interface¬ 
avoiding  MCL  programming.  The  on-screen  representation  of  modules  can  be 
interconnected  with  channels  and  grouped  into  hierarchical  subsystems.  The  application 
designer  will  be  able  to  zoom  in  and  out  to  view  the  application  at  several  levels. 

The  graphical  environment  will  allow  both  the  integration  of  existing  modules  and 
the  development  of  the  interfaces  of  modules  that  have  not  yet  been  written.  The  tools  will 
generate  template  MPL  code  for  those  modules.  In  this  way  the  graphical  environment 
functions  as  a  design  tool  and  program  generator  as  well  as  an  integration  environment 

The  graphical  environment  will  have  fault-tolerance  analysis  built  into  it.  Single 
points  of  failure  will  be  identified  on-screen.  The  user  will  be  able  to  replicate  entire 
subsystems  at  once,  with  the  forker  and  joiner  modules  and  allocation  constraints 
introduced  into  the  application  automatically  by  the  system. 

This  graphical  style  of  application  integration  will  greatly  facilitate  the  building  and 
deployment  of  reusable  software  components  modules  built  to  be  easily  customized  and 
reintegrated  into  many  applications.  Given  a  suitable  library  of  reusable  component 
modules  and  the  graphical  integrator,  it  will  be  possible  for  non-programmers  to  build  large 
custom  applications  from  these  parts. 
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Figure  5:  Prototype  Graphical  Resource  Scheduling  Tool 
Graphical  Resource  Management  Tools 

Along  with  the  graphical  software  development  tools,  we  are  pursuing  graphical  resource 
management  tools.  These  are  a  non-programmer's  interface  into  the  advanced  Maruti 
scheduling  technology.  The  Maruti  allocator/scheduler  works  with  the  abstract  concepts  of 
schedulable  entities,  available  resources,  and  various  types  of  constraints  on  the  placement 
of  entities  and  resources.  In  the  Maruti  operating  system,  the  scheduling  entities  are  EUs, 
and  the  resources  are  CPUs,  network,  memory,  and  devices  -  but  in  fact  any  type  of  entity 
or  resource  can  be  manipulated  by  the  allocator/scheduler. 

A  graphical  resource  management  tool  will  allow  the  specification  of  these  entities, 
resources,  and  constraints  on  screen  in  a  way  more  oriented  towards  the  general  user.  With 
this  tool  users  should  be  able  to  use  Maruti  scheduling  technology  to  schedule  classes, 
busses,  or  projects,  for  example. 

We  have  built  a  small  prototype  of  the  graphical  resource  manager.  The  prototype 
displays  the  EU  graph  input  to  the  scheduler  as  well  as  the  calendar  output  of  the  scheduler. 
The  user  can  edit  the  EU  graph  and  its  constraints  and  reschedule  with  die  click  of  a  button. 
The  resulting  resource  calendar  is  redisplayed. 


9.  Availability 

We  are  pleased  to  announce  the  availability  of  the  Maruti  3.1  Hard  Real-Time  Operating 
System  and  Development  Environment. 

With  Maruti  3.1,  we  are  entering  a  new  phase  of  our  project  We  have  an  operating 
system  suitable  for  field  use  by  a  wider  range  of  users,  and  we  are  embarking  on  the 
integration  of  our  time-based,  hard  real-time  technology  with  industry  standards  and  more 
traditional  event-based  soft-  and  non-real-time  systems.  For  this,  we  are  greatly  interested 
in  the  feedback  from  users  as  to  the  direction  of  evolution  of  the  system. 
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For  the  Maruti  3.1  project,  we  will  be  pursuing  the  integration  of  a  POSIX  interface 
for  soft  and  non-real-time  applications,  the  use  of  Ada  for  Maruti  programming,  support 
for  asynchronous  events  and  soft/non-real  time  schedulers  within  the  time-based 
framework,  and  heterogeneous  Maruti  networks. 

For  this  user-oriented  phase  of  the  project  we  will  be  making  regular  releases  of  our 
software  available  to  allow  interested  parties  to  track  and  influence  our  development.  To 
begin  this  phase  we  are  making  our  current  base  hard  real-time  operating  system  and  its 
development  environment  available.  This  is  an  initial  test  release. 

Maruti  3.1  will  be  made  available  to  interested  parties  on  request,  via  Internet  ftp. 
Please  send  electronic  mail  to  maruti-dist@cs.umd.edu  for  details.  More  information  about 
the  Maruti  Project,  as  well  as  papers  and  documentation,  are  available  via  the  World  Wide 
Web  at 

http://www.cs.umd.edU/projects/maruti/l 

9.1.  Runtime  System 

The  Maruti  3.1  embeddable  hard  real-time  runtime  system  for  distributed  and  single-node 
systems  includes  the  following  features: 

•  The  core  Maruti  runtime  system  is  small  - 16  KB  code  for  the  single  node  core,  30  KB 
code  for  the  distributed  core. 

•  The  core  provides  a  calendar-based  scheduler,  threads,  distributed  message  passing 
using  Time  Division  Multiplexed  Access  (TDMA)  over  the  network,  and  tight  time 
synchronization  between  network  nodes. 

•  Also  included  in  the  runtime  system  is  a  graphics  library  suitable  for  system  monitoring 
displays  as  well  as  simulations. 

•  Maruti  runs  on  PC-AT  compatible  computers  using  the  Intel  i386  (with  i387 
coprocessor),  i486DX,  or  Pentium  processors.  Distributed  operation  currently  requires 
a  3Com  3c507  ethemet  card.  The  graphics  library  supports  standard  VGA  and  Tseng- 
Labs  ET-4000-based  Super-VGA  Support  for  other  SVGA  chipsets  is  forthcoming 
soon. 

9.2.  Development  Environment 

Maruti  3.1  includes  a  complete  development  environment  for  distributed  embedded  hard 
real-time  applications.  The  development  environment  runs  on  NetBSD  Unix  and  includes 
the  following: 

•  d  The  Maruti/Virtual  debugging  environment  -  simulates  the  Maruti  runtime  system 
within  the  development  environment.  The  system  clock  in  this  environment  tracks 
virtual  time,  which  can  be  sped  up,  slowed  down  in  relation  to  the  actual  time,  or 
single-stepped  or  stopped.  This  allows  temporal  debugging  of  the  application.  Within 
Maruti/Virtual  traces  of  the  application  scheduling  and  network  traffic  can  be  monitored 
in  the  debugging  session. 
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•  The  ANSI-C  based  Maruti  Programming  Language  (MPL/C).  MPL  adds  modules, 
message  passing  primitives,  shared  memory,  periodic  functions,  message-invoked 
functions,  and  exclusion  regions  to  ANSI  C.  MPL  is  processed  by  a  version  of  the 
GNU  C  compiler  which  has  been  modified  to  recognize  the  new  MPL  features,  and  to 
output  information  about  the  resources  used  by  the  MPL  program. 

•  The  Maruti  Configuration  Language  (MCL).  MCL  allows  the  system  designer  to 
specify  the  placement,  timing  constraints,  and  interconnections  of  ail  the  modules  in  an 
application.  MCL  is  a  powerful  interpreted  C-like  language,  allowing  complex, 
hierarchical  configuration  specifications,  including  replication  of  components  and 
installation-site  specific  sizing  of  the  application.  The  MCL  processor  analyses  the 
application  graph  for  completeness,  and  type-checks  all  connections. 

•  The  Maruti  Allocator/Scheduler.  The  Maruti  allocation  and  schedulingtool  analyses  the 
information  generated  by  the  MPL  compiler  and  the  MCL  integrator  to  find  an 
allocation  and  scheduling  of  the  tasks  of  a  distributed  application  across  the  nodes  of  a 
Maruti  network.  AD  relative  and  global  timing,  exclusion,  and  precedence  constraints 
are  taken  into  account  in  finding  a  schedule,  as  are  the  network  speed  and  scheduling 
parameters. 

•  The  Maruti  Timing  Trace  Analyzer.  The  Timing  Analyzer  calculates  worst-case 
computation  times  from  timing  files  output  by  the  runtime  system.  Computation  times 
are  calculated  for  each  scheduling  unit  in  the  application,  and  these  times  can  be  fed 
back  into  the  Allocator/Scheduler  for  more  precise  scheduling  analysis. 

•  The  Maruti  Runtime  Binder  (mbind).  One  of  the  features  of  Maruti  is  the  late  binding 
of  an  application  to  a  particular  runtime  system.  The  same  application  binaries  can  be 
combined  with  different  system  libraries  to  build  a  binary  customized  for  a  particular 
application  in  a  particular  setting.  Only  those  portions  of  the  system  library  needed  for 
that  binding  are  included.  Mbind  manages  this  final  step. 

•  The  Maruti  Application  Builder  (mbuild).  Mbuild  automates  the  process  of  building  an 
application  by  generating  for  the  programmer  a  customizable  makefile  that  manages  the 
complete  process  of  compiling,  configuring,  scheduling,  and  binding  an  application. 
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